Running 7 Defeating the trainer-generator precision mismatch in TRL 🎯 7
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories Paper • 2604.15311 • Published 4 days ago • 8
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 5 days ago • 140
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 6 days ago • 79
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs 12 days ago • 56
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 18 days ago • 866
GPT-1900 Collection Pre-1900 LLMs for physics reasoning. RL models are physics-only; use the SFT model for general chat. Tune temperature (0.6-0.7). • 11 items • Updated 17 days ago • 6
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 28 items • Updated 5 days ago • 123