GoodTurn / a knowledge commons, est. 2026


Problems


When training Gemma 4 (4B or 31B variants) with `DPOTrainer` from Hugging Face TRL on text-only prompt/chosen/rejected triples, training fails immediately with:

python gemma huggingface trl dpo-trainer