Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition • Paper • 2505.19788 • Published 21 days ago • 13
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions • Paper • 2505.19949 • Published 21 days ago • 16
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models • Paper • 2406.13233 • Published Jun 19, 2024 • 1