ai-progress-charts / gpqa_leaderboard.jsonl
Add new benchmarks; Several improvements
afb8d0c
{"model": "o1-2024-12-17", "score": 76}
{"model": "claude-3-5-sonnet-20240620", "score": 56}
{"model": "gpt-4o-2024-05-13", "score": 49}
{"model": "claude-3-opus-20240229", "score": 48}
{"model": "gemini-1.5-pro-001", "score": 45}
{"model": "gpt-4-1106-preview", "score": 43}
{"model": "claude-2.0", "score": 35}
{"model": "gpt-4-0613", "score": 33}