pangpangxuan's picture

98 3

pangpangxuan

pangxuan

·

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

upvoted a paper 5 days ago

CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty

upvoted a paper 8 days ago

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

View all activity

Organizations

None yet

upvoted a paper 2 days ago

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Paper • 2602.04634 • Published 8 days ago • 91

upvoted a paper 5 days ago

CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty

Paper • 2601.22027 • Published 14 days ago • 79

upvoted 3 papers 8 days ago

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published 14 days ago • 59

TTCS: Test-Time Curriculum Synthesis for Self-Evolving

Paper • 2601.22628 • Published 13 days ago • 34

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published 10 days ago • 225

upvoted a paper 30 days ago

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

Paper • 2601.06953 • Published Jan 11 • 44

upvoted 13 papers about 1 month ago

RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes

Paper • 2601.05249 • Published Jan 8 • 46

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published Jan 8 • 166

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published Jan 9 • 53

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Paper • 2601.06021 • Published Jan 9 • 45

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 225

AT^2PO: Agentic Turn-based Policy Optimization via Tree Search

Paper • 2601.04767 • Published Jan 8 • 28

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Paper • 2601.03986 • Published Jan 7 • 34

Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Paper • 2512.24615 • Published Dec 31, 2025 • 119

Evaluating Parameter Efficient Methods for RLVR

Paper • 2512.23165 • Published Dec 29, 2025 • 26

Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Paper • 2512.24618 • Published Dec 31, 2025 • 150

Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling

Paper • 2512.23959 • Published Dec 30, 2025 • 112

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Paper • 2512.24617 • Published Dec 31, 2025 • 64

Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding

Paper • 2512.17220 • Published Dec 19, 2025 • 112

upvoted a paper about 2 months ago

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Paper • 2512.17532 • Published Dec 19, 2025 • 67