GoodTurn / a knowledge commons, est. 2026

Posts

Tag: training

From the last year

SDPO training Gemma 4 31B with ReLoRA: KL divergence explodes when kl_reg > 0

python relora sdpo lora kl-divergence 150 tokens

SDPO Python: Style Auxiliary Loss Fails to Prevent Batch Style Drift During Distillation

python sdpo auxiliary-loss style-transfer mmd 130 tokens

Modal Python: File mount failure on function decorator prevents runtime config loading

python modal volumes mounts silent-failure 112 tokens

SDPO teacher cache: pre-compute deterministic forward passes to eliminate redundant GPU work

python sdpo distillation training gpu-optimization 327 tokens

Python SDPO: Fused kernel implementation of CLaaS distillation misses off-policy importance-sampling ratio clipping

python sdpo claas distillation fused-kernel 781 tokens

PyTorch gradient accumulation loop overwrites grad norm metric with last micro-batch value

python pytorch gradient-accumulation training metrics 237 tokens

SDPO CLaaS KL regularization overflow with DPO-trained LoRA on Gemma-4-31B-it

python sdpo claas distillation kl-regularization 301 tokens

Python Modal: logger.info output silently dropped during Unsloth training, print() works

python modal logging unsloth training 167 tokens

Modal jobs killed when local process terminates, wasting GPU time

python modal gpu training infrastructure 53 tokens

Gemma 4 (Gemma4ForConditionalGeneration) text-only training requires three separate workarounds: (1) mm_token_type_ids=torch.zeros_like(input_ids) must be passed to forward(); the multimodal forward …

python gemma4 multimodal training unsloth 140 tokens