Loading an SFT checkpoint that already has LoRA adapters and then calling get_peft_model() causes double-initialization. Check for existing adapters first, or merge the SFT LoRA into the base weights before DPO.
Problem: When loading an SFT checkpoint that already has LoRA adapters (via from_pretrained(...)), calling get_peft_model(model, config) again tries to add a second LoRA adapter to the same model, causing ValueError: "Cannot add a new adapter when one already exists."
Why surprising: The standard DPO setup pattern (from trl docs) is:
model = AutoModelForCausalLM.from_pretrained(checkpoint)
model = get_peft_model(model, peft_config)
trainer = DPOTrainer(model, ...)
But if checkpoint is itself a LoRA-trained model (already has adapters), this fails. The checkpoint load preserves the LoRA layers; get_peft_model() doesn't detect this and tries to add new layers.
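One way to tell in advance whether a checkpoint was saved as a LoRA adapter is to look for the adapter_config.json file that PEFT's save_pretrained() writes next to the adapter weights. A minimal sketch, assuming the checkpoint is a local directory; the helper name is_adapter_checkpoint is hypothetical and not part of peft or trl:

```python
from pathlib import Path


def is_adapter_checkpoint(checkpoint_dir: str) -> bool:
    """Return True if the directory looks like a PEFT adapter checkpoint.

    Assumption: the checkpoint is a local directory written by PEFT's
    save_pretrained(), which stores the adapter settings in
    adapter_config.json.
    """
    return (Path(checkpoint_dir) / "adapter_config.json").is_file()
```

This only inspects the files on disk, so it can run before any weights are loaded. It does not cover checkpoints referenced by a Hub ID.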
Fix: Check if model already has LoRA and skip get_peft_model():
model = AutoModelForCausalLM.from_pretrained(checkpoint)
if not hasattr(model, 'peft_config') or not model.peft_config:
    model = get_peft_model(model, peft_config)
trainer = DPOTrainer(model, ...)
Or: always merge the SFT LoRA into base weights before DPO, then add fresh LoRA for DPO refinement.
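The merge alternative could look roughly like the following. This is a sketch rather than a tested recipe: it assumes the SFT checkpoint is a PEFT adapter directory whose adapter config points at the base model, and the helper name merge_sft_then_add_fresh_lora and its parameters are hypothetical.

```python
def merge_sft_then_add_fresh_lora(sft_checkpoint, dpo_peft_config):
    """Fold the SFT LoRA into the base weights, then attach a new LoRA for DPO."""
    # Imported here so the heavy dependencies load only when the helper runs.
    from peft import AutoPeftModelForCausalLM, get_peft_model

    # Loads the base model named in the adapter config plus the SFT adapter.
    sft_model = AutoPeftModelForCausalLM.from_pretrained(sft_checkpoint)
    # merge_and_unload() returns the plain base model with the LoRA deltas merged in.
    merged = sft_model.merge_and_unload()
    # The merged model carries no adapters, so get_peft_model() is safe to call.
    return get_peft_model(merged, dpo_peft_config)
```

Here dpo_peft_config would be a peft LoraConfig for the DPO stage, and the returned model is what gets passed to DPOTrainer. With this route the DPO adapter trains on top of the merged SFT weights instead of reusing the SFT adapter.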