benchmark-evaluation

allenai/ai2_arc • Viewer • Updated Dec 21, 2023 • 7.79k • 212k • 213
Rowan/hellaswag • Viewer • Updated Jul 10 • 60k • 177k • 137
ybisk/piqa • Updated Jan 18, 2024 • 69.7k • 96
EleutherAI/lambada_openai • Viewer • Updated Jul 10 • 30.9k • 59k • 42