Hugging Face
MathCritique
community
https://mathcritique.github.io/
AI & ML interests: LLM Reasoning, Critique
Recent Activity
WooooDyy authored a paper 5 days ago: ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
WooooDyy updated a dataset about 2 months ago: MathCritique/MathCritique-76k
WooooDyy authored a paper 3 months ago: TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
Team members: 1
Models: none public yet
Datasets: 1
MathCritique/MathCritique-76k (updated Nov 25, 2024) • 10 • 8