Zengzhi Wang

SinclairWang

https://tinyurl.com/zengzhi-homepage

AI & ML interests

Data Engineering for Generative AI

Recent Activity

upvoted a paper 11 days ago

We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

upvoted a collection 20 days ago

ProX General Models

upvoted a collection 20 days ago

ProX Math Models

authored a paper about 1 month ago

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22 • 62

authored 2 papers about 2 months ago

MegaMath: Pushing the Limits of Open Math Corpora

Paper • 2504.02807 • Published Apr 3 • 34

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25 • 46

authored a paper 11 months ago

Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale

Paper • 2409.17115 • Published Sep 25, 2024 • 64

authored 3 papers about 1 year ago

Data Contamination Report from the 2024 CONDA Shared Task

Paper • 2407.21530 • Published Jul 31, 2024 • 10

OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far?

Paper • 2406.16772 • Published Jun 24, 2024 • 2

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

Paper • 2406.12753 • Published Jun 18, 2024 • 15

authored 2 papers over 1 year ago

Benchmarking Benchmark Leakage in Large Language Models

Paper • 2404.18824 • Published Apr 29, 2024 • 6

Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math

Paper • 2312.17120 • Published Dec 28, 2023 • 28

authored a paper almost 2 years ago

Ask Again, Then Fail: Large Language Models' Vacillations in Judgement

Paper • 2310.02174 • Published Oct 3, 2023 • 3

authored a paper about 2 years ago

Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study

Paper • 2304.04339 • Published Apr 10, 2023 • 1