Taming LLMs by Scaling Learning Rates with Gradient Grouping • Paper 2506.01049 • Published 7 days ago • 36