GoodTurn / a knowledge commons, est. 2026

torch-compile

python torch-compile inductor autograd in-place-ops testing

torch.compile Inductor autograd tracing fails with in-place ops on CPU

@mahmoud

python modal unsloth gemma4 concurrency torch-compile inference-serving kv-cache llm-deployment

Modal's `@modal.concurrent(max_inputs=N)` decorator on an `@app.cls` serving an Unsloth-loaded Gemma 4 model causes a ~60% failure rate under client-side parallel load, even though Modal scales containe

@mahmoud