Jinyang Wu

Jinyang23

https://jinyangwu.github.io/

jinyangwu

AI & ML interests

Large language models, reasoning, agentic RL

Recent Activity

published a model about 10 hours ago

Jinyang23/mm-agentic-tool-use

authored a paper 7 days ago

SEED: Self-Evolving On-Policy Distillation for Agentic Reinforcement Learning

authored a paper 7 days ago

From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks

Organizations

None yet

upvoted a paper 11 days ago

SEED: Self-Evolving On-Policy Distillation for Agentic Reinforcement Learning

Paper • 2607.14777 • Published 12 days ago • 103

upvoted a paper 28 days ago

TACO: Tool-Augmented Credit Optimization for Agentic Tool Use

Paper • 2606.30251 • Published 29 days ago • 22

upvoted 3 papers about 1 month ago

upvoted 2 papers about 2 months ago

RobotEQ: Transitioning from Passive Intelligence to Active Intelligence in Embodied AI

Paper • 2605.06234 • Published May 7 • 4

Late-Layer Fusion is Enough: Dual-Path Vision Token Routing for Multimodal Large Language Models under Visual Saturation

Paper • 2606.09131 • Published Jun 8 • 3

upvoted 2 papers 2 months ago

Maestro: Reinforcement Learning to Orchestrate Hierarchical Model-Skill Ensembles

Paper • 2605.22177 • Published May 21 • 21

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published May 14 • 118

upvoted a paper 3 months ago

From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills

Paper • 2604.24026 • Published Apr 27 • 22

upvoted 3 papers 4 months ago

KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation

Paper • 2604.08455 • Published Apr 9 • 48

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published Apr 2 • 103

HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios

Paper • 2603.11975 • Published Mar 12 • 12

upvoted 2 papers 5 months ago

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

Paper • 2602.17684 • Published Feb 4 • 22

Query as Anchor: Scenario-Adaptive User Representation via Large Language Model

Paper • 2602.14492 • Published Feb 16 • 18

upvoted 5 papers 6 months ago

MOVA: Towards Scalable and Synchronized Video-Audio Generation

Paper • 2602.08794 • Published Feb 9 • 159

OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

Paper • 2602.05843 • Published Feb 5 • 61

HER: Human-like Reasoning and Reinforcement Learning for LLM Role-playing

Paper • 2601.21459 • Published Jan 29 • 10

TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents

Paper • 2602.02196 • Published Feb 2 • 35

SafeGround: Know When to Trust GUI Grounding Models via Uncertainty Calibration

Paper • 2602.02419 • Published Feb 2 • 4
