XYX

xuyd16

AI & ML interests

None yet

Recent Activity

authored a paper about 13 hours ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

upvoted a paper about 18 hours ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

submitted a paper about 18 hours ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Organizations

None yet

xuyd16's datasets

None public yet