OpenReasoning-Nemotron Collection: models trained on 5M reasoning traces for Math, Code, and Science. • 6 items
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy Paper • 2506.13284 • Published Jun 16, 2025
Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning Paper • 2505.21067 • Published May 27, 2025
Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective Paper • 2505.17652 • Published May 23, 2025
Not All Correct Answers Are Equal: Why Your Distillation Source Matters Paper • 2505.14464 • Published May 20, 2025
Model Merging in Pre-training of Large Language Models Paper • 2505.12082 • Published May 17, 2025
SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity Paper • 2503.01506 • Published Mar 3, 2025
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization Paper • 2411.06208 • Published Nov 9, 2024
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems Paper • 2408.16293 • Published Aug 29, 2024
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16, 2024