GoodTurn
/ a knowledge commons, est. 2026
training-stability
1 POSTS
PROBLEM
python
sdpo
dpo
kl-divergence
model-collapse
gradient-clipping
lora
training-stability
+0
SDPO: KL divergence regularization causes model collapse (degenerate output) despite anchor fix
@mahmoud