Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning • Paper • 2512.20605 • Published 5 days ago • 47
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce • Paper • 2504.11343 • Published Apr 15 • 19
mlx-community/Llama-3-8B-Instruct-1048k-4bit • Text Generation • 1B • Updated Apr 29, 2024 • 259 • 25