SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14 • 110
Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • 13 days ago • 105
Article Building a Real-Time Video Chat with Gemini 2.0, Gradio, and WebRTC 👀👂 By freddyaboulton • Jan 13 • 9
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language By davidberenstein1957 and 5 others • Dec 16, 2024 • 130
Article Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others • Dec 31, 2024 • 1.08k