TRL DPOTrainer (v0.23) crashes with KeyError: 'images' on Gemma 4 models loaded from local/volume paths instead of HuggingFace model IDs.
At init (line 76), the trainer checks whether model.config.model_type is in MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES and finds gemma4, which maps to the multimodal Gemma4ForConditionalGeneration. It then routes to process_row, which expects an images column in the dataset, instead of tokenize_row.
The common workaround of popping gemma4 from the mapping works, but only if the Gemma detection code actually runs. When base_model_name is a volume/local path like /models/my-merged-model/ (common after a LoRA merge), string-matching for 'gemma' in the path fails, so all Gemma-specific fixes are skipped. The model's config.model_type is still gemma4 regardless of load path.
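The failure mode described above can be reproduced without loading a model. This is only a minimal sketch: the dict is a stand-in for MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES with just the one entry mentioned here, SimpleNamespace stands in for the loaded model's config, and both helper names are hypothetical.

```python
from types import SimpleNamespace

# Stand-in for the transformers mapping (assumption: only the gemma4 entry
# described in this note is reproduced here).
IMAGE_TEXT_MAPPING = {"gemma4": "Gemma4ForConditionalGeneration"}


def detected_by_name(base_model_name: str) -> bool:
    """Fragile check: look for 'gemma' in the model name or path string."""
    return "gemma" in base_model_name.lower()


def routed_as_multimodal(config, mapping) -> bool:
    """Membership test of the kind the trainer is described as doing at init."""
    return getattr(config, "model_type", "") in mapping


# Stand-in for the config of a merged model loaded from a local path.
config = SimpleNamespace(model_type="gemma4")
local_path = "/models/my-merged-model/"

name_hit = detected_by_name(local_path)
multimodal = routed_as_multimodal(config, IMAGE_TEXT_MAPPING)
print(name_hit, multimodal)  # prints: False True
```

The name check misses the model while the membership test still matches, which is the combination that leads to the multimodal code path with no fix applied.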
Detect Gemma models by checking both the model name string AND the loaded model's config:
is_gemma = (
    'gemma' in base_model_name.lower()
    or getattr(model.config, 'model_type', '').startswith('gemma')
)
Then apply both fixes before constructing DPOTrainer:
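from transformers.models.auto.modeling_auto import MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES
MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES.pop('gemma4', None)  # drop the multimodal entry if present
model.config.model_type = 'gemma2'  # text-only type, so the membership check returns False
The model_type override is necessary because TRL checks model.config.model_type in MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES.keys() at DPOTrainer.__init__ line 76. Setting it to 'gemma2' (text-only) makes the check return False.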
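The detection and the two fixes can be wrapped in small helpers so they run the same way for hub IDs and local paths. The sketch below is one possible arrangement, not TRL's or the author's code: is_gemma_model and apply_text_only_fixes are hypothetical names, the mapping is passed in as a parameter, and a plain dict plus SimpleNamespace stand in for the real mapping and model.config so the example runs on its own.

```python
from types import SimpleNamespace


def is_gemma_model(base_model_name: str, config) -> bool:
    """Detect Gemma by name/path string OR by the loaded config's model_type."""
    model_type = getattr(config, "model_type", "") or ""
    return "gemma" in base_model_name.lower() or model_type.startswith("gemma")


def apply_text_only_fixes(config, mapping, text_only_type: str = "gemma2") -> bool:
    """Apply both fixes; return True if the config no longer matches the mapping."""
    mapping.pop("gemma4", None)          # fix 1: remove the multimodal entry
    config.model_type = text_only_type   # fix 2: override with a text-only type
    return config.model_type not in mapping


# Stand-ins (assumptions): a dict for the transformers mapping and a
# SimpleNamespace for model.config of a merged model loaded from a volume.
mapping = {"gemma4": "Gemma4ForConditionalGeneration"}
config = SimpleNamespace(model_type="gemma4")

text_only = False
if is_gemma_model("/models/my-merged-model/", config):
    text_only = apply_text_only_fixes(config, mapping)

print(text_only, config.model_type)  # prints: True gemma2
```

In a real training script the same calls would presumably receive model.config and the imported MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES, and would run before DPOTrainer is constructed.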