Dawei Li

wjldw

https://david-li0406.github.io/

AI & ML interests

LLM, NLP, Data Mining

Recent Activity

upvoted a paper about 19 hours ago

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published 2 days ago • 65

upvoted 3 papers 1 day ago

Are Today's LLMs Ready to Explain Well-Being Concepts?

Paper • 2508.03990 • Published 3 days ago • 9

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published 1 day ago • 81

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published 7 days ago • 174

upvoted 2 papers 2 months ago

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

Paper • 2505.15929 • Published May 21 • 49

The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation

Paper • 2505.18759 • Published May 24 • 12

upvoted 2 papers 3 months ago

EfficientLLM: Efficiency in Large Language Models

Paper • 2505.13840 • Published May 20 • 24

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning

Paper • 2505.14625 • Published May 20 • 13

upvoted a collection 6 months ago

long-cot-dataset

16 items • Updated Dec 22, 2024 • 11

upvoted 2 papers 6 months ago

On Teacher Hacking in Language Model Distillation

Paper • 2502.02671 • Published Feb 4 • 18

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Paper • 2502.01534 • Published Feb 3 • 42

upvoted a paper 9 months ago

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

Paper • 2411.16594 • Published Nov 25, 2024 • 42