GtZeng PRO

chaoscodes

AI & ML interests

None yet

Recent Activity

liked a dataset about 16 hours ago

elefantai/p2p-full-data

Updated 5 days ago • 7.13k • 10

upvoted 2 papers 1 day ago

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published 4 days ago • 120

Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published 3 days ago • 69

updated a model 10 days ago

AgentCPT/Qwen3-4B_thinking_agent_sft_nemotron_tool_calling_v2_lr1e-5_epoch_1_ctx_16384_bs_256

4B • Updated 10 days ago • 12

published a model 10 days ago

AgentCPT/Qwen3-4B_thinking_agent_sft_nemotron_tool_calling_v2_lr1e-5_epoch_1_ctx_16384_bs_256

4B • Updated 10 days ago • 12

updated 2 models 14 days ago

AgentCPT/qwen-8b-agent-sft

8B • Updated 14 days ago • 6

AgentCPT/qwen-4b-agent-sft

4B • Updated 14 days ago • 5

published 2 models 14 days ago

AgentCPT/qwen-8b-agent-sft

8B • Updated 14 days ago • 6

AgentCPT/qwen-4b-agent-sft

4B • Updated 14 days ago • 5

updated a model 22 days ago

FuxiAISGLab/nonhis_game_behavior_clone_model_qwen-VL-2B

2B • Updated 22 days ago • 7

published a model 22 days ago

FuxiAISGLab/nonhis_game_behavior_clone_model_qwen-VL-2B

2B • Updated 22 days ago • 7

updated a model 23 days ago

FuxiAISGLab/game_behavior_clone_model_qwen-VL-4B

5B • Updated 23 days ago • 8

published a model 23 days ago

FuxiAISGLab/game_behavior_clone_model_qwen-VL-4B

5B • Updated 23 days ago • 8

updated a model 23 days ago

FuxiAISGLab/game_behavior_clone_model_qwen-VL-2B

2B • Updated 23 days ago • 9

published a model 23 days ago

FuxiAISGLab/game_behavior_clone_model_qwen-VL-2B

2B • Updated 23 days ago • 9

updated a dataset 23 days ago

chaoscodes/game_behavior_cloning

Viewer • Updated 23 days ago • 318 • 18

published a dataset 23 days ago

chaoscodes/game_behavior_cloning

Viewer • Updated 23 days ago • 318 • 18

upvoted 2 papers about 2 months ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 251

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 99

updated a dataset 6 months ago

chaoscodes/filter_swe_smith

Viewer • Updated Jul 19, 2025 • 10.8k • 1
