Zhangchen Xu PRO

zhangchenxu

https://zhangchenxu.com/

AI & ML interests

LLM Data, Alignment, Post-Training, Safety

Recent Activity

liked a model 1 day ago

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

new activity about 1 month ago

Agent-Ark/Toucan-1.5M:Clarification on SFT dataset construction for reproducing results

upvoted a paper about 2 months ago

Efficient Long-context Language Model Training by Core Attention Disaggregation

Organizations

Collections 1

Papers 12

spaces 2

TinyV

💬

Verify model answers against ground truth

Chat With Magpie

💬

Generate responses in a chat with a friendly bot

models 40

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step312

4B • Updated Jul 30 • 6

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step288

4B • Updated Jul 30 • 8

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step256

4B • Updated Jul 30 • 7

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step224

4B • Updated Jul 30 • 6

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step192

4B • Updated Jul 30 • 5

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step160

4B • Updated Jul 30 • 8

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step128

4B • Updated Jul 30 • 5

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step96

4B • Updated Jul 30 • 6

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step64

4B • Updated Jul 30 • 6

zhangchenxu/deepseek-math-7b-instruct-deepscaler_5k_prime_step468

7B • Updated Jul 30 • 5

Collections 1

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning

TinyV

zhangchenxu/TinyV-Qwen3-1.7B

zhangchenxu/TinyV-Qwen3-1.7B-Think

datasets 14

zhangchenxu/HardVerify-Math

zhangchenxu/TinyV_Think_Training_Data_Qwen3_Balanced

zhangchenxu/TinyV_Training_Data_Qwen3_Balanced

zhangchenxu/bigmath_tinyv_filtered

zhangchenxu/TinyV_Training_Data_Balanced

zhangchenxu/TinyV_Think_Training_Data_Balanced

zhangchenxu/KodCode_50K_R1

zhangchenxu/KodCode_Hard_18K_R1

zhangchenxu/Magpie-100k-Gemma2-9B

zhangchenxu/zero-eval
