oasis

community

tongyx361

authored a paper 10 months ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 144

tongyx361

authored a paper 11 months ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5, 2025 • 58

tongyx361

authored 2 papers over 1 year ago

ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation

Paper • 2304.05977 • Published Apr 12, 2023 • 3

DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving

Paper • 2407.13690 • Published Jun 18, 2024 • 2