Ning Ding

stingning

https://www.stingning.cn

ningding97

AI & ML interests

NLP

Recent Activity

liked a model about 1 month ago

openbmb/MiniCPM-o-4_5

upvoted a paper 10 days ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published 11 days ago • 56

upvoted a paper about 1 month ago

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

Paper • 2602.09443 • Published Feb 10 • 58

upvoted a paper 4 months ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 134

upvoted 5 papers 6 months ago

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

Paper • 2509.25123 • Published Sep 29, 2025 • 22

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Paper • 2509.07894 • Published Sep 9, 2025 • 31

upvoted 3 papers 7 months ago

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4, 2025 • 76

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21, 2025 • 272

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97

upvoted a paper 10 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28, 2025 • 132

upvoted a paper 11 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 122

upvoted 4 papers about 1 year ago

Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models

Paper • 2503.11224 • Published Mar 14, 2025 • 28

UltraIF: Advancing Instruction Following from the Wild

Paper • 2502.04153 • Published Feb 6, 2025 • 24

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3, 2025 • 62

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

Paper • 2501.18362 • Published Jan 30, 2025 • 23

upvoted an article about 1 year ago

Article

Process Reinforcement through Implicit Rewards

Jan 3, 2025

upvoted a paper about 1 year ago

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41

upvoted a collection over 1 year ago

ImplicitPRM

Collection

3 items • Updated 18 days ago • 5
