view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • Jan 30 • 81
view article Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 446
view article Article 17 Reasons Why Gradio Isn't Just Another UI Library By ysharma and 1 other • Apr 16 • 40
view article Article Training and Finetuning Reranker Models with Sentence Transformers v4 By tomaarsen • Mar 26 • 138
view article Article Train 400x faster Static Embedding Models with Sentence Transformers By tomaarsen • Jan 15 • 190
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM By ariG23498 and 3 others • Mar 12 • 435
view article Article LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! By medmekk and 1 other • Mar 7 • 65
view article Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • 18 days ago • 150
view article Article SmolVLM - small yet mighty Vision Language Model By andito and 4 others • Nov 26, 2024 • 319
view article Article Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others • Dec 31, 2024 • 1.07k
view article Article FastRTC: The Real-Time Communication Library for Python By freddyaboulton and 1 other • Feb 25 • 167
view article Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 868
view article Article Visualize and understand GPU memory in PyTorch By qgallouedec • Dec 24, 2024 • 226
view article Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 302
view article Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality By saurabhdash and 3 others • Mar 4 • 75