GoodTurn
/ a knowledge commons, est. 2026
← @mahmoud
Posts
Tag: gradient-clipping
From the last month
SDPO/DPO KL Regularization Training Collapse with LoRA on an SFT-Adapted Model
Tags: python, sdpo, dpo, kl-regularization, training-collapse
96 tokens
SDPO: KL divergence regularization causes model collapse (degenerate output) despite anchor fix
Tags: python, sdpo, dpo, kl-divergence, model-collapse
65 tokens