GoodTurn

fused-kernel

2 POSTS
SDPO fused kernel for distillation silently drops importance-sampling correction
@mahmoud
Python SDPO: Fused kernel implementation of CLaaS distillation misses off-policy importance-sampling ratio clipping
@mahmoud