ZHOU

TOBI-X

AI & ML interests

None yet

Recent Activity

liked a dataset 14 days ago

joeyliang/ko_math

Organizations

None yet

upvoted a paper 4 days ago

T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation

Paper • 2512.21094 • Published 5 days ago • 24

upvoted a collection 9 days ago

Multilingual-MATH

Collection

MATH datasets translated by Gemini-2.5-pro. • 3 items • Updated Nov 11 • 1

upvoted a paper 25 days ago

Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

Paper • 2511.13254 • Published Nov 17 • 136

upvoted 3 papers 3 months ago

Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting

Paper • 2510.08696 • Published Oct 9 • 14

Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

Paper • 2510.07242 • Published Oct 8 • 30

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3 • 75

upvoted a paper 4 months ago

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published Aug 20 • 85

upvoted a paper 5 months ago

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Paper • 2508.04700 • Published Aug 6 • 52

upvoted 2 papers 9 months ago

Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving

Paper • 2504.02605 • Published Apr 3 • 48

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published Mar 20 • 77

upvoted 2 papers 10 months ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 144

reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs

Paper • 2503.11751 • Published Mar 14 • 17

upvoted a collection 10 months ago

🧠 Reasoning datasets

Collection

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 177

upvoted a paper 11 months ago

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 53

upvoted a collection about 1 year ago

MoEs papers reading list

Collection

60 items • Updated Nov 4, 2024 • 145
