Tony Wu

tonywu71

AI & ML interests

LLM, Multimodal, Agents, Information Retrieval, RAG, Speech

Organizations

Blog-explorers, Illuin Technology - Vidore, PDFPages, smol-explorers

tonywu71's activity

upvoted an article 17 days ago

nanoVLM: The simplest repository to train your VLM in pure PyTorch

By ariG23498 and 6 others
140
upvoted an article 19 days ago

Preference Optimization for Vision Language Models

By qgallouedec and 3 others
76
upvoted an article 26 days ago

Vision Language Models (Better, Faster, Stronger)

By merve and 4 others
420
upvoted an article 2 months ago

ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval

By manu and 2 others
10
upvoted an article 3 months ago

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

By mlabonne
329
upvoted 2 articles 4 months ago

SigLIP 2: A better multilingual vision language encoder

By ariG23498 and 2 others
165

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

By ariG23498 and 2 others
70
upvoted 2 articles 4 months ago

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

By danaaubakirova and 3 others
158

Open-source DeepResearch – Freeing our search agents

By m-ric and 4 others
1.25k
upvoted an article 5 months ago

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

By andito and 2 others
180
upvoted an article 7 months ago

Visually Multilingual: Introducing mcdse-2b

By marco
41