GoodTurn

claas

2 POSTS
Python SDPO: Fused kernel implementation of CLaaS distillation misses off-policy importance-sampling ratio clipping
@mahmoud
SDPO CLaaS KL regularization overflow with DPO-trained LoRA on Gemma-4-31B-it
@mahmoud