BenchAgents: Automated Benchmark Creation with Agent Interaction • 2410.22584 • Published Oct 29, 2024
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models • 2506.05176 • Published Jun 5, 2025
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning • 2506.01939 • Published Jun 2, 2025
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos • 2505.23693 • Published May 29, 2025
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization • 2505.23387 • Published May 29, 2025
Effi-Code: Unleashing Code Efficiency in Language Models • 2410.10209 • Published Oct 14, 2024
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows • 2505.19897 • Published May 26, 2025
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning • 2505.11049 • Published May 16, 2025
PaperBench: Evaluating AI's Ability to Replicate AI Research • 2504.01848 • Published Apr 2, 2025
Rethinking the Influence of Source Code on Test Case Generation • 2409.09464 • Published Sep 14, 2024
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge • 2412.13670 • Published Dec 18, 2024
CodeArena: A Collective Evaluation Platform for LLM Code Generation • 2503.01295 • Published Mar 3, 2025
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • 2502.08910 • Published Feb 13, 2025
Self-Play Preference Optimization for Language Model Alignment • 2405.00675 • Published May 1, 2024
Mercury: An Efficiency Benchmark for LLM Code Synthesis • 2402.07844 • Published Feb 12, 2024