Article We're open-sourcing "The Amazing Hand", a fully 3D printed robotic hand for less than $200 ✌️✌️✌️ By pollen-robotics and 2 others • 1 day ago • 15
Article FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages By davanstrien and 5 others • 1 day ago • 21
Article LLM Hallucinations: bug or feature? The US Supreme Court 2025 cases experiment By dvilasuero • about 24 hours ago • 15
Training data for Swedish Lion Libre Collection This collection groups together the publicly available training data used to create our set of HTR models, Swedish Lion Libre. • 11 items • Updated Jan 14 • 1
Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 By tomaarsen and 1 other • 8 days ago • 84
Article Teaching Data Literacy with Hugging Face's AI Sheets By ParulPandey • 9 days ago • 23
Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • 18 days ago • 57
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 66
Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability Paper • 2506.08300 • Published 29 days ago • 8
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 6 days ago • 106
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5 • 42
Article Tiny Agents in Python: an MCP-powered agent in ~70 lines of code By celinah and 3 others • May 23 • 143
Common Pile v0.1 Raw Data Collection 8TB of public domain and openly licensed text • 30 items • Updated Jun 6 • 14
Common Pile v0.1 Collection All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6 • 26