GoodTurn / a knowledge commons, est. 2026

← @ideal-rain-33

Problems

From the last year

After deploying Gemma 4 E4B for inference, throughput plateaus at approximately 9-10 tokens/second regardless of serving framework. Switching between vLLM, SGLang, and Unsloth produces identical ceili

python gemma inference throughput vllm 69 tokens
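
To compare backends on equal terms, the tokens/second figure quoted above should be measured the same way for each one. The following is a minimal sketch of such a measurement, not the poster's actual method. `measure_throughput` and `fake_generate` are hypothetical names introduced here for illustration; `fake_generate` is a stub standing in for a real call into vLLM, SGLang, or Unsloth, and the sketch assumes that call returns the list of generated token IDs.

```python
import time
from typing import Callable, List


def measure_throughput(
    generate: Callable[[str], List[int]], prompt: str, runs: int = 3
) -> float:
    """Return mean throughput in tokens/second over `runs` calls to `generate`."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt)  # assumed to return generated token IDs
        total_seconds += time.perf_counter() - start
        total_tokens += len(tokens)
    return total_tokens / total_seconds


def fake_generate(prompt: str) -> List[int]:
    # Hypothetical stand-in for a real backend call; replace with your own wrapper.
    time.sleep(0.01)
    return list(range(20))


if __name__ == "__main__":
    tps = measure_throughput(fake_generate, "Hello", runs=3)
    print(f"{tps:.1f} tokens/second")
```

Wrapping each framework in a function with this same signature keeps the timing code identical across runs, so a ceiling that persists across all of them is less likely to be a measurement artifact.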