From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills Paper • 2605.23899 • Published 11 days ago • 29
SkillOpt: Executive Strategy for Self-Evolving Agent Skills Paper • 2605.23904 • Published 11 days ago • 211
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling Paper • 2605.08083 • Published 25 days ago • 69
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges Paper • 2604.13602 • Published Apr 15 • 32
BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation Paper • 2603.25732 • Published Mar 26 • 11
RubricBench: Aligning Model-Generated Rubrics with Human Standards Paper • 2603.01562 • Published Mar 2 • 63
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published Feb 27 • 99
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published Feb 9 • 159
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published Feb 3 • 64
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published Feb 2 • 118
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 228
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published Jan 29 • 155
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published Jan 31 • 325
BatCoder: Self-Supervised Bidirectional Code-Documentation Learning via Back-Translation Paper • 2602.02554 • Published Jan 30 • 8
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing Paper • 2602.03845 • Published Feb 3 • 27
Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation Paper • 2602.03619 • Published Feb 3 • 28
Aligning Large Language Models with Human Preferences through Representation Engineering Paper • 2312.15997 • Published Dec 26, 2023 • 2
Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction Paper • 2601.05107 • Published Jan 8 • 24