SOD: Step-wise On-policy Distillation for Small Language Model Agents Paper • 2605.07725 • Published May 8 • 25
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published May 7 • 233
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 29 days ago • 204
OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation Paper • 2605.12038 • Published May 12 • 4
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published May 12 • 195
openai/clip-vit-large-patch14 Zero-Shot Image Classification • 0.4B • Updated Sep 15, 2023 • 8.83M • 2.04k
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models Paper • 2605.06597 • Published May 7 • 16
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published May 7 • 114
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published May 6 • 103