Bowen

PeterJinGo

AI & ML interests

None yet

Recent Activity

updated a dataset 1 day ago

Archive-models/musique

published a dataset 1 day ago

Archive-models/musique

updated a dataset 1 day ago

Archive-models/nq_hotpotqa_train_search_sample10

Organizations

PeterJinGo's activity

upvoted a collection 3 days ago

Search-R1-v0.3

RL with outcome reward + format reward. https://arxiv.org/abs/2505.15117 • 11 items • Updated 1 day ago • 1

upvoted a paper 19 days ago

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published 20 days ago • 70

upvoted 2 collections about 2 months ago

Search-R1-v0.2

Exploration with a more stable RL pipeline with outcome-only reward and scaled-up LLMs. https://arxiv.org/abs/2503.09516 • 25 items • Updated 1 day ago • 3

Search-R1

Preliminary checkpoints with outcome-only RL. • 14 items • Updated Apr 7 • 8

upvoted a paper 2 months ago

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 31