HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published 8 days ago • 41
Article Falcon-Arabic: A Breakthrough in Arabic Language Models By tiiuae and 7 others • 17 days ago • 30
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval Paper • 2505.16967 • Published 15 days ago • 22
Running 90 MiniMax Speech Tech Report 🎙 Generate high-quality speech from text with MiniMax-Speech
U-MATH and μ-MATH - University-level math evaluation Collection Paper: A UNIVERSITY-LEVEL BENCHMARK FOR EVALUATING MATHEMATICAL SKILLS IN LLMS • 4 items • Updated Jan 14 • 17
SARD: Synthetic Arabic Recognition Dataset Collection A large-scale synthetic Arabic OCR dataset comprising 843,622 book-style document images across 10 fonts, designed to advance VLM for Arabic Texts • 2 items • Updated 19 days ago • 3
Article LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? By danaaubakirova and 6 others • 27 days ago • 57