EvalEval Bot
EvalEvalBot
1 follower · 2 following
AI & ML interests
None yet
EvalEvalBot's activity
New activity in
evaleval/EEE_datastore
about 16 hours ago
[Submission] HAL Leaderboard — 9 agentic benchmarks (246 entries)
1
#80 opened about 16 hours ago by
Asaf-Yehudai
New activity in
evaleval/EEE_datastore
about 20 hours ago
Repair HF PR #26 alphaXiv data to strict schema and canonical identity
2
#79 opened about 21 hours ago by
yananlong
New activity in
evaleval/EEE_datastore
about 22 hours ago
[ACL Shared Task] Add LingOly benchmark results
2
#78 opened about 22 hours ago by
ambean
New activity in
evaleval/EEE_datastore
about 23 hours ago
Restore missing HF PR #57 entries that did not land in PR #74
2
#76 opened 1 day ago by
yananlong
Updated a dataset 1 day ago
evaleval/EEE_datastore • Updated about 21 hours ago • 11.5k • 3.12k • 19
New activity in
evaleval/EEE_datastore
1 day ago
Add HELM AIR-Bench v1.19.0 results
5
#70 opened 9 days ago by
yifanmai
[ACL Shared Task] Add PACEBench evaluation results
1
#77 opened 1 day ago by
mrpfisher
New activity in
evaleval/EEE_datastore
2 days ago
Normalize schema versions to 0.2.2 and backfill canonical identity
🚀
2
6
#74 opened 3 days ago by
yananlong
[ACL Shared Task] Add CocoaBench aggregate results
1
#75 opened 2 days ago by
Cerru02
New activity in
evaleval/EEE_datastore
4 days ago
[ACL Shared Task] Add Multi-SWE-Bench and SWE-PolyBench leaderboard data
4
#72 opened 4 days ago by
jatinganhotra
New activity in
evaleval/EEE_datastore
7 days ago
Add alphaXiv SOTA evaluations (27,976 records, 1,646 benchmarks)
10
#26 opened 2 months ago by
simpod
Add AlpacaEval 1.0 and 2.0 leaderboard data (324 models)
7
#65 opened 9 days ago by
karthikchundi
New activity in
evaleval/EEE_datastore
8 days ago
[Submission] Fix win_rate scale (0-1) and merge Fibble variants into composite benchmark
1
#71 opened 8 days ago by
drchangliu
New activity in
evaleval/EEE_datastore
9 days ago
[ACL Shared Task] Add AlpacaEval 1.0 and 2.0 leaderboard data (324 models)
1
#69 opened 9 days ago by
karthikchundi
[ACL Shared Task] Add SWE-bench Verified official leaderboard data
11
#63 opened 11 days ago by
jatinganhotra
[ACL Shared Task] Add BountyBench (DetectWorkflow) evaluation results
1
#67 opened 9 days ago by
mrpfisher
New activity in
evaleval/EEE_datastore
10 days ago
Add HELM Capabilities v1.15.0 results
1
#64 opened 10 days ago by
yifanmai
New activity in
evaleval/EEE_datastore
13 days ago
[ACL Shared Task] Add Artificial Analysis LLM results
2
#62 opened 13 days ago by
Cerru02
New activity in
evaleval/EEE_datastore
15 days ago
[ACL Shared Task] Add Arcadia Impact Inspect evaluation results
🚀
2
10
#57 opened 16 days ago by
mrpfisher
New activity in
evaleval/EEE_datastore
16 days ago
Parquet for dataset viewer
#59 opened 16 days ago by
EvalEvalBot