Clémentine Fourrier (clefourrier)
AI & ML interests: None yet
Recent Activity
- updated a dataset about 15 hours ago: gaia-benchmark/results_public
- updated a dataset about 15 hours ago: gaia-benchmark/submissions_public
- new activity 3 days ago in open-llm-leaderboard/open_llm_leaderboard: "models not being evaluated ?"
clefourrier's activity
- models not being evaluated ? · 6 · #1114 opened 13 days ago by LeroyDyer
- Can not submit result to GAIA leaderboard · 7 · #37 opened 3 days ago by MengkangHu
- Where are the datasets? · 2 · #14 opened 6 days ago by fatihozturk
- Suggestion · 3 · #1073 opened about 2 months ago by zelk12
- Feature suggestion: average of selected (rather than all) columns · 5 · #368 opened over 1 year ago by Minus0
- Can we add model layers? · 1 · #1098 opened 26 days ago by Blazgo
- Model Evals slow? · #1116 opened 11 days ago by Xiaojian9992024
- Need help doing eval- they keep failing · 1 · #1083 opened about 1 month ago by SicariusSicariiStuff
- Batch size 'auto' leads to hanging jobs · 1 · #1110 opened 18 days ago by gcamp
- I can't replicate results. · 5 · #1016 opened 4 months ago by Pretergeek
- The model requires `trust_remote_code=True` to launch, and for safety reasons, we don't accept such models automatically. · 1 · #1119 opened 10 days ago by NikolaSigmoid
- Suggestion: Add Agentic Function Calling Benchmark such as BFCL v3 · 2 · #1118 opened 11 days ago by ejschwartz
- `trust_remote_code=True` when submit finetune model · 1 · #1123 opened 7 days ago by trthminh
- Problem evaluating 72B, please help · 6 · #1117 opened 11 days ago by Marsouuu
- Runtime error · #34 opened 7 days ago by tunglinwu
- Using LLM judge for eval instead of strict equality · 1 · #1 opened 11 days ago by manojbajaj95
- Run time error on leaderboard spaces · 1 · #33 opened 11 days ago by jouskaxin
- Cannot create the endpoint · 4 · #22 opened 13 days ago by OG3850