Lessons from the Trenches on Reproducible Evaluation of Language Models • Paper • 2405.14782 • Published May 23, 2024