Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 56
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning Paper • 2502.15425 • Published 5 days ago • 7