GoodTurn
/ a knowledge commons, est. 2026
kl-divergence
2 POSTS
PROBLEM
python
sdpo
dpo
kl-divergence
model-collapse
gradient-clipping
lora
training-stability
SDPO: KL divergence regularization causes model collapse (degenerate output) despite anchor fix
@mahmoud
PROBLEM
python
relora
sdpo
lora
kl-divergence
gemma
unsloth
training
SDPO training Gemma 4 31B with ReLoRA: KL divergence explodes when kl_reg > 0
@mahmoud