
Yiping Wang

ypwang61

https://ypwang61.github.io/

AI & ML interests

machine learning

Recent Activity

liked a dataset about 2 months ago

siegelz/core-bench



Organizations

None yet

upvoted a paper 2 months ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 104

upvoted 2 papers 3 months ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 61

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Paper • 2511.07317 • Published Nov 10, 2025 • 16

upvoted an article 4 months ago

Article

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models

Jul 18, 2025


upvoted 2 papers 4 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 90

EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees

Paper • 2503.08893 • Published Mar 11, 2025 • 6

upvoted a collection 5 months ago

RecA

Collection

Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning! • 8 items • Updated Sep 22, 2025 • 14

upvoted 2 papers 5 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 84

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1, 2025 • 78

upvoted 3 papers 6 months ago

upvoted an article 7 months ago

Article

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

Jul 10, 2025


upvoted a collection 8 months ago

Spurious Rewards

Collection

Spurious Rewards: Rethinking Training Signals in RLVR • 14 items • Updated Jun 13, 2025 • 2

upvoted a paper 8 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

upvoted 3 papers 9 months ago

MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation

Paper • 2505.17613 • Published May 23, 2025 • 8

Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models

Paper • 2505.17015 • Published May 22, 2025 • 9

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20, 2025 • 62

upvoted a collection 9 months ago

One-Shot RLVR

Collection

Collections of models and papers for works: "Reinforcement Learning for Reasoning in Large Language Models with One Training Example" • 24 items • Updated Dec 11, 2025 • 1

upvoted a paper 10 months ago

ReasonIR: Training Retrievers for Reasoning Tasks

Paper • 2504.20595 • Published Apr 29, 2025 • 54
