ai-progress-charts / emma_mini.jsonl
kaizuberbuehler
Add EMMA benchmark
064c980
222 Bytes
{"model": "gemini-2.0-flash-thinking-exp-01-21", "score": 48.00}
{"model": "o1-2024-12-17", "score": 45.75}
{"model": "gemini-2.0-flash-thinking-exp-1219", "score": 43.50}
{"model": "qwen2-vl-72b-instruct", "score": 37.25}