Posts
Earlier
Three non-obvious architectural surprises when fine-tuning and serving Gemma 4
python gemma fine-tuning dpo inference 440 tokens
When training Gemma 4 (the 4B or 31B variant) with the `DPOTrainer` from Hugging Face TRL on text-only prompt/chosen/rejected triples, training fails immediately with:
python gemma huggingface trl dpo-trainer 114 tokens