CompVis Community

university

Recent Activity

penfever authored a paper 6 days ago

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

penfever authored a paper 6 days ago

MARVIS: Modality Adaptive Reasoning over VISualizations

lvwerra

authored a paper 13 days ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published 15 days ago • 60

Skylion007

authored a paper 23 days ago

The Diffusion Duality

Paper • 2506.10892 • Published 28 days ago • 37

loubnabnl

authored a paper about 1 month ago

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5 • 42

liorwolf

authored a paper about 1 month ago

Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability

Paper • 2506.02138 • Published Jun 2 • 1

loubnabnl

posted an update about 2 months ago

SmolVLM is now available on PocketPal — you can run it offline on your smartphone to interpret the world around you. 🌍📱

And check out this real-time camera demo by @ngxson, powered by llama.cpp:
https://github.com/ngxson/smolvlm-realtime-webcam
https://x.com/pocketpal_ai

3 replies

zwrq

authored a paper 3 months ago

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22 • 61

anton-l

authored a paper 3 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 192

pcuenq

authored a paper 3 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 192

lvwerra

authored a paper 3 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 192

loubnabnl

authored a paper 3 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 192

ajhamdi

authored a paper 3 months ago

4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding

Paper • 2503.17827 • Published Mar 22 • 8

aswerdlow

authored 2 papers 3 months ago

Unifying 2D and 3D Vision-Language Understanding

Paper • 2503.10745 • Published Mar 13

Unified Multimodal Discrete Diffusion

Paper • 2503.20853 • Published Mar 26 • 9

Skylion007

authored a paper 4 months ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 72

RafailFridman

authored a paper 5 months ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published Feb 5 • 30

DanahY

authored a paper 5 months ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published Feb 5 • 30

loubnabnl

authored a paper 5 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 235

anton-l

authored a paper 5 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 235

lvwerra

authored a paper 5 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 235

ZongzeWu

authored a paper 5 months ago

SliderSpace: Decomposing the Visual Capabilities of Diffusion Models

Paper • 2502.01639 • Published Feb 3 • 26