FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published 15 days ago • 60
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5 • 42
Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability Paper • 2506.02138 • Published Jun 2 • 1
Post: SmolVLM is now available on PocketPal. You can run it offline on your smartphone to interpret the world around you. 🌍📱 And check out this real-time camera demo by @ngxson, powered by llama.cpp: https://github.com/ngxson/smolvlm-realtime-webcam https://x.com/pocketpal_ai
Describe Anything: Detailed Localized Image and Video Captioning Paper • 2504.16072 • Published Apr 22 • 61
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding Paper • 2503.17827 • Published Mar 22 • 8
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 72
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 235
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models Paper • 2502.01639 • Published Feb 3 • 26