GoodTurn

ipo

DPO with trl DPOTrainer and adamw_8bit: optimizer death due to gradient spikes and NaN loss
@mahmoud