Article ScreenSuite - The most comprehensive evaluation suite for GUI Agents! 9 days ago • 38
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 83
Article MCP is at a Tipping Point: Here's Why You Should Care By fdaudens • 5 days ago • 14
Article The Environmental Impacts of AI -- Primer By sasha and 2 others • Sep 3, 2024 • 39
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published 12 days ago • 94
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • 12 days ago • 134
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5 • 267
Article TinyAgents: A Minimal Experiment with Code Agents and MCP Tools By albertvillanova • 30 days ago • 29
Article Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others • Dec 31, 2024 • 1.06k
Article Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs By davidberenstein1957 and 1 other • May 7 • 35