OpenEvals's Collections

Archived Open LLM Leaderboard (2024-2025)

updated Apr 2

This leaderboard has evaluated LLMs since June 2024 on IFEval, MuSR, GPQA, MATH, BBH, and MMLU-Pro.

  • Running
    119

    Open-LLM performances are plateauing, let’s make the leaderboard steep again

    πŸ”

    Update leaderboard for fair model evaluation

    Note: Blog post on why we made a new version of the Open LLM Leaderboard


  • Running on CPU Upgrade
    13.1k

    Open LLM Leaderboard

    πŸ†

    Track, rank and evaluate open LLMs and chatbots

    Note: The actual leaderboard, with a stylish new UX :)


  • open-llm-leaderboard/contents

    Viewer • Updated Mar 20 • 4.58k • 7.94k • 16

    Note: If you want to download the main leaderboard table, you'll find the dataset here!


  • open-llm-leaderboard/results

    Preview • Updated Mar 15 • 8.65k • 12

    Note: To extract more detailed aggregated results for each model, look here!


  • open-llm-leaderboard/requests

    Preview • Updated Mar 17 • 18.1k • 10

    Note: All models ever submitted to the leaderboard


  • Running on CPU Upgrade
    92

    Open LLM Leaderboard Model Comparator

    πŸ†

    Compare Open LLM Leaderboard results
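
The three datasets in this collection (`open-llm-leaderboard/contents`, `open-llm-leaderboard/results`, `open-llm-leaderboard/requests`) can in principle be pulled with the Hugging Face `datasets` library. The following is a minimal sketch, not an official recipe: the helper names `dataset_page_url` and `load_leaderboard_table` are hypothetical, the `train` split name is an assumption that should be checked against the dataset viewer, and the `datasets` and `pandas` packages are assumed to be installed.

```python
# Repository IDs exactly as they appear in the collection listing.
LEADERBOARD_DATASETS = {
    "contents": "open-llm-leaderboard/contents",   # main leaderboard table
    "results": "open-llm-leaderboard/results",     # detailed aggregated results
    "requests": "open-llm-leaderboard/requests",   # all submitted models
}


def dataset_page_url(name: str) -> str:
    """Return the Hub page URL for one of the collection's datasets."""
    return f"https://huggingface.co/datasets/{LEADERBOARD_DATASETS[name]}"


def load_leaderboard_table(split: str = "train"):
    """Download the main leaderboard table and return it as a pandas DataFrame.

    The split name is an assumption; adjust it if the repository
    exposes a different one.
    """
    # Imported lazily so the helpers above work without the dependency.
    from datasets import load_dataset

    table = load_dataset(LEADERBOARD_DATASETS["contents"], split=split)
    return table.to_pandas()
```

Calling `load_leaderboard_table()` requires network access; inspecting `df.columns` on the returned DataFrame is the safest way to discover the available fields, since the column names are not listed on this page.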
