Rethinking RL for LLM Reasoning: It's Sparse Policy Selection, Not Capability Learning Paper • 2605.06241 • Published 6 days ago • 3
LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning Paper • 2512.05325 • Published Dec 5, 2025 • 5