Zhangchen Xu PRO

zhangchenxu

https://zhangchenxu.com/

AI & ML interests

LLM Data, Alignment, Post-Training, Safety

Recent Activity

liked a model 9 days ago

ibm-granite/granite-4.0-h-tiny

liked a model 12 days ago

PaddlePaddle/PaddleOCR-VL

upvoted a paper 13 days ago

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

Organizations

Collections 1

Papers 12

spaces 2

TinyV

💬

Verify model answers against ground truth

Chat With Magpie

💬

Generate responses in a chat with a friendly bot

models 40

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step312

4B • Updated Jul 30, 2025

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step288

4B • Updated Jul 30, 2025

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step256

4B • Updated Jul 30, 2025

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step224

4B • Updated Jul 30, 2025

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step192

4B • Updated Jul 30, 2025

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step160

4B • Updated Jul 30, 2025 • 1

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step128

4B • Updated Jul 30, 2025

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step96

4B • Updated Jul 30, 2025

zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step64

4B • Updated Jul 30, 2025

zhangchenxu/deepseek-math-7b-instruct-deepscaler_5k_prime_step468

7B • Updated Jul 30, 2025

Collections 1

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning

TinyV

zhangchenxu/TinyV-Qwen3-1.7B

zhangchenxu/TinyV-Qwen3-1.7B-Think

datasets 14

zhangchenxu/HardVerify-Math

zhangchenxu/TinyV_Think_Training_Data_Qwen3_Balanced

zhangchenxu/TinyV_Training_Data_Qwen3_Balanced

zhangchenxu/bigmath_tinyv_filtered

zhangchenxu/TinyV_Training_Data_Balanced

zhangchenxu/TinyV_Think_Training_Data_Balanced

zhangchenxu/KodCode_50K_R1

zhangchenxu/KodCode_Hard_18K_R1

zhangchenxu/Magpie-100k-Gemma2-9B

zhangchenxu/zero-eval
