Hongru Wang

Merlin-Hongru

https://rulegreen.github.io/

Merlin-Hongru's activity

upvoted a paper 5 days ago

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

Paper • 2505.24846 • Published 8 days ago • 15

upvoted a paper 9 days ago

ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

Paper • 2505.22961 • Published 10 days ago • 8

upvoted a paper 11 days ago

Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Paper • 2505.20286 • Published 12 days ago • 6

authored a paper 11 days ago

Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Paper • 2505.20286 • Published 12 days ago • 6

upvoted a paper 12 days ago

AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting

Paper • 2505.18822 • Published 14 days ago • 14

authored a paper 12 days ago

AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting

Paper • 2505.18822 • Published 14 days ago • 14

upvoted a paper 14 days ago

Time-R1: Towards Comprehensive Temporal Reasoning in LLMs

Paper • 2505.13508 • Published 23 days ago • 14

upvoted a paper 25 days ago

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published 27 days ago • 143

authored a paper about 1 month ago

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 76

upvoted 2 papers about 1 month ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5 • 24

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 76

authored 2 papers about 1 month ago

SMART: Self-Aware Agent for Tool Overuse Mitigation

Paper • 2502.11435 • Published Feb 17

OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published Apr 21 • 33

commented on a paper about 2 months ago

OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published Apr 21 • 33

authored a paper about 2 months ago

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published Apr 16 • 44

upvoted 2 papers about 2 months ago

OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published Apr 21 • 33

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published Apr 16 • 44

authored a paper 2 months ago

Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

Paper • 2503.24377 • Published Mar 31 • 17

authored 2 papers 7 months ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 20

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21, 2024 • 7