Each ribbon below is one training step on a 0.6B model. Inference (the rollout server) is the row that matters: every red block is wall-clock time during which no tokens are being generated. Hit play.