Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning Paper • 2402.13669 • Published Feb 21, 2024 • 1
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2604.12374 • Published 4 days ago • 29
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 4 days ago • 77
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 4 days ago • 96
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published 5 days ago • 134
SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks Paper • 2603.24755 • Published 24 days ago • 30
Marco-MoE Collection A suite of multilingual MoE models with highly sparse architectures • 5 items • Updated 10 days ago • 14
Terminal Agents Suffice for Enterprise Automation Paper • 2604.00073 • Published 18 days ago • 95
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published Mar 6 • 48
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 23 days ago • 131
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Paper • 2603.24533 • Published 24 days ago • 47
From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents Paper • 2603.22386 • Published 26 days ago • 55
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks Paper • 2510.08002 • Published Oct 9, 2025 • 24
CooperBench: Why Coding Agents Cannot be Your Teammates Yet Paper • 2601.13295 • Published Jan 19 • 5
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions Paper • 2502.13791 • Published Feb 19, 2025 • 6
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published Mar 16 • 149
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 30 days ago • 66