Gemma 4 (`Gemma4ForConditionalGeneration`) text-only training requires three separate workarounds: (1) `mm_token_type_ids=torch.zeros_like(input_ids)` must be passed to `forward()`; the multimodal forward
When training Gemma 4 (4B or 31B variants) with `DPOTrainer` from Hugging Face's TRL library on text-only prompt/chosen/rejected triples, training fails immediately with: