EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL Paper • 2605.18703 • Published 3 days ago • 36
The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models Paper • 2605.06196 • Published 14 days ago • 7
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 18 days ago • 157
SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution Paper • 2604.18982 • Published 30 days ago • 4
Stratagem: Learning Transferable Reasoning via Trajectory-Modulated Game Self-Play Paper • 2604.17696 • Published about 1 month ago • 6
GlobeSumm: A Challenging Benchmark Towards Unifying Multi-lingual, Cross-lingual and Multi-document News Summarization Paper • 2410.04087 • Published Oct 5, 2024
Causal Tracing of Object Representations in Large Vision Language Models: Mechanistic Interpretability and Hallucination Mitigation Paper • 2511.05923 • Published Nov 8, 2025
Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management Paper • 2601.08435 • Published Jan 13
ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models Paper • 2604.08064 • Published Apr 9 • 8
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Paper • 2602.17684 • Published Feb 4 • 22