leaderboards - a MoritzLaurer Collection

leaderboards

updated Apr 2

Running

4.45k

Chatbot Arena Leaderboard

🏆

Display chatbot arena leaderboard and statistics
Running on CPU Upgrade

13.1k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade

5.8k

MTEB Leaderboard

🥇

Embedding Leaderboard
Running on CPU Upgrade

855

Open ASR Leaderboard

🏆

Request evaluation for new speech models
Running

504

LLM-Perf Leaderboard

🏆

Explore LLM performance across hardware
Running

1.33k

Big Code Models Leaderboard

📈

Submit code models for evaluation on benchmarks
Runtime error

78

Human & GPT-4 Evaluation of LLMs Leaderboard

👩
Running

442

Can Ai Code Results

🏆

Can AI Code? An LLM leaderboard including quantized models.
Runtime error

140

Hallucinations Leaderboard

🔥

View and submit LLM evaluations
Runtime error

105

Enterprise Scenarios Leaderboard

🥇
Running on CPU Upgrade

92

LLM Safety Leaderboard

🥇

View and submit machine learning model evaluations
Running

549

Vision Arena (Testing VLMs side-by-side)

🖼

Analyze images to detect and label objects
Running

66

CyberSecEvalTest

📈

Evaluate LLM cybersecurity risks
Running

330

LLM Performance Leaderboard

🐨

View LLM Performance Leaderboard
Running on CPU Upgrade

72

AIR-Bench Leaderboard

🥇

Explore benchmark results for QA and long doc models
Running on CPU Upgrade

782

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection
Running

376

Reward Bench Leaderboard

📐

Display and filter reward model evaluation data
Running

210

BigCodeBench Leaderboard

🥇

Explore and analyze code evaluation data
Running

10

MJ Bench Leaderboard

🥇

Display and filter multimodal model leaderboard results
Running

110

MTEB Arena

⚔

Launch MTEB Arena to compare models
Runtime error

152

Open LLM Progress Tracker

🔬

Visualize Open vs. Proprietary LLM Progress
Running

103

Judge Arena

💻

Vote on AI responses to rank models
Running on Zero

377

TTS Spaces Arena

🤗

Blind vote on HF TTS models!
Running

136

smolagents LLM leaderboard

🏆

A leaderboard for LLMs powering smolagents