In-the-wild Interactions with Search-LLMs with Human Preferences
LMArena
Organization Card
LMArena is an open platform for crowdsourced AI benchmarking, originally created by researchers from UC Berkeley SkyLab.
We have officially graduated from LMSYS.org!
Chat for free with the best AI models at lmarena.ai, and see the rankings at lmarena.ai/leaderboard.
Collections (3)
Spaces (8)
🏆 Chatbot Arena Leaderboard: Display chatbot leaderboard and statistics (Running • 4.44k)
⚡ Arena Hard Viewer: Browse and evaluate model judgments from benchmarks (Running • 2)
🔥 Llama-4-Maverick-03-26-Experimental Battles: Browse and compare model conversation outcomes (Running • 28)
😻 Prompt Freshness: Select similarity and language to filter prompts (Running)
📚 Category Arena Example: Browse chatbot responses to compare models (Running • 9)
🦀 Preference Proxy Evaluations (Running • 7)
Models (20)
lmarena-ai/p2l-7b-grk-01112025 • Updated • 4 • 3
lmarena-ai/p2l-7b-grk-02222025 • Updated • 180 • 6
lmarena-ai/p2l-0.5b-bt-01132025 • Updated • 4
lmarena-ai/p2l-1.5b-bt-01132025 • Updated • 6
lmarena-ai/p2l-3b-bt-01132025 • Updated • 4
lmarena-ai/p2l-7b-bt-01132025 • Updated • 20 • 2
lmarena-ai/p2l-135m-bt-01132025 • Updated • 3
lmarena-ai/p2l-360m-bt-01132025 • Updated • 3
lmarena-ai/p2l-135m-rk-01132025 • Updated • 4
lmarena-ai/p2l-360m-rk-01132025 • Updated • 3
Datasets (21)
lmarena-ai/search-arena-24k • Viewer • Updated • 24.1k • 124 • 4
lmarena-ai/arena-hard-auto • Updated • 431 • 2
lmarena-ai/categories-benchmark-eval • Preview • Updated • 32 • 3
lmarena-ai/search-arena-v1-7k • Viewer • Updated • 7k • 544 • 17
lmarena-ai/webdev-arena-preference-10k • Viewer • Updated • 10.5k • 271 • 8
lmarena-ai/repochat-arena-preference-4k • Viewer • Updated • 3.84k • 78 • 4
lmarena-ai/arena-human-preference-100k • Viewer • Updated • 106k • 1.03k • 46
lmarena-ai/VisionArena-Chat • Viewer • Updated • 199k • 3.17k • 5
lmarena-ai/VisionArena-Battle • Viewer • Updated • 29.8k • 264 • 6
lmarena-ai/vision-arena-bench-v0.1 • Viewer • Updated • 500 • 1.51k • 2