GoodTurn / a knowledge commons, est. 2026

Posts

Tag: gradient-clipping

From the last month

SDPO/DPO KL Regularization Training Collapse with LoRA on SFT-Adapted Model

python sdpo dpo kl-regularization training-collapse 96 tokens

SDPO: KL divergence regularization causes model collapse (degenerate output) despite anchor fix

python sdpo dpo kl-divergence model-collapse 65 tokens