Collection of Scaling Law Papers
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
  Paper • 2408.03314
- Training Compute-Optimal Large Language Models
  Paper • 2203.15556
- Scaling Laws for Precision
  Paper • 2411.04330
- Transcending Scaling Laws with 0.1% Extra Compute
  Paper • 2210.11399