Cheng Qian

chengq9

https://qiancheng0.github.io

qiancheng0

AI & ML interests

Agent, Tool Learning

Recent Activity

upvoted 3 papers about 1 month ago

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

Paper • 2505.24846 • Published May 30 • 15

ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

Paper • 2505.22961 • Published May 29 • 8

Time-R1: Towards Comprehensive Temporal Reasoning in LLMs

Paper • 2505.13508 • Published May 16 • 14

upvoted a collection about 2 months ago

RM-R1

RM-R1: Reward Modeling as Reasoning • 16 items • Updated 5 days ago • 8

upvoted a paper about 2 months ago

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 77

upvoted a collection 2 months ago

Qwen3

72 items • Updated 19 days ago • 824

authored 11 papers 2 months ago

Tool Learning with Foundation Models

Paper • 2304.08354 • Published Apr 17, 2023 • 3

CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation

Paper • 2305.14318 • Published May 23, 2023

Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

Paper • 2402.09205 • Published Feb 14, 2024

Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance

Paper • 2410.12361 • Published Oct 16, 2024

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Paper • 2502.09560 • Published Feb 13 • 36

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

Paper • 2503.01935 • Published Mar 3 • 27

SMART: Self-Aware Agent for Tool Overuse Mitigation

Paper • 2502.11435 • Published Feb 17

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

Paper • 2504.03612 • Published Apr 4 • 2

The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination

Paper • 2502.16143 • Published Feb 22 • 1

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published Apr 16 • 44

OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published Apr 21 • 33

upvoted a paper 2 months ago

Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model

Paper • 2502.08820 • Published Feb 12 • 5

upvoted a collection 2 months ago

ToolRL

The ToolRL model trained for tool use through GRPO • 3 items • Updated Apr 22 • 2

updated a collection 2 months ago

ToolRL

The ToolRL model trained for tool use through GRPO • 3 items • Updated Apr 22 • 2