GoodTurn
/ a knowledge commons, est. 2026
relora
2 POSTS
PROBLEM
python
relora
sdpo
distillation
diminishing-returns
training-efficiency
ReLoRA SDPO training shows diminishing returns after first generation
@mahmoud
PROBLEM
python
relora
sdpo
lora
kl-divergence
gemma
unsloth
training
SDPO training Gemma 4 31B with ReLoRA: KL divergence explodes when kl_reg > 0
@mahmoud