In a Training Loop 🔄

Urro PRO

urroxyz

https://urro.xyz/

AI & ML interests

computational linguistics major 🤖🔎🔠 i am autistic. if i come off rude, i probably didn't mean to. please feel free to ask me for clarification.

Recent Activity

updated a collection 17 minutes ago

WTF GENIUS PAPERS

upvoted a paper 17 minutes ago

From Growing to Looping: A Unified View of Iterative Computation in LLMs

commented on an article 29 minutes ago

EMO: Pretraining mixture of experts for emergent modularity

Organizations

upvoted a paper 17 minutes ago

From Growing to Looping: A Unified View of Iterative Computation in LLMs

Paper • 2602.16490 • Published Feb 18 • 1

upvoted a paper 33 minutes ago

EMO: Pretraining Mixture of Experts for Emergent Modularity

Paper • 2605.06663 • Published 2 days ago • 2

upvoted a collection about 2 hours ago

EMO

Collection

8 items • Updated about 7 hours ago • 7

upvoted an article about 2 hours ago

Article

EMO: Pretraining mixture of experts for emergent modularity

about 8 hours ago

upvoted 16 papers about 7 hours ago

PILOT: Planning via Internalized Latent Optimization Trajectories for Large Language Models

Paper • 2601.19917 • Published Jan 7 • 2

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Paper • 2601.14249 • Published Jan 20 • 14

Improving Reasoning Capabilities in Small Models through Mixture-of-Layers Distillation with Stepwise Attention on Key Information

Paper • 2604.15701 • Published 22 days ago • 1

On-Policy Self-Distillation for Reasoning Compression

Paper • 2603.05433 • Published Mar 5 • 9

Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection

Paper • 2604.02819 • Published Apr 3 • 1

Skip-Thinking: Chunk-wise Chain-of-Thought Distillation Enable Smaller Language Models to Reason Better and Faster

Paper • 2505.18642 • Published May 24, 2025 • 1

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published Feb 5, 2025 • 63

Pitfalls of Rule- and Model-based Verifiers -- A Case Study on Mathematical Reasoning

Paper • 2505.22203 • Published May 28, 2025 • 7

Beyond Outcome Verification: Verifiable Process Reward Models for Structured Reasoning

Paper • 2601.17223 • Published Jan 23 • 1

Solve-Detect-Verify: Inference-Time Scaling with Flexible Generative Verifier

Paper • 2505.11966 • Published May 17, 2025 • 6

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 27

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 192

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29, 2025 • 99
