Category Error
#3
by
CO-IR
- opened
Regarding the benchmark metrics, HLE(humanity last exam) should belong to higher-order reasoning, not the code domain.
Fixed.
Regarding the benchmark metrics, HLE(humanity last exam) should belong to higher-order reasoning, not the code domain.
Fixed.