Johannes Kolbe

johko

AI & ML interests

None yet

Recent Activity

upvoted an article 5 days ago

Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines

published a Space about 1 month ago

johko/kreuzberg-rag

published a Space 5 months ago

johko/computer-vision-quiz

upvoted an article 5 days ago

Article

Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines

9 days ago

upvoted an article 8 months ago

Article

FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages

Jul 8, 2025

upvoted a paper 11 months ago

MIEB: Massive Image Embedding Benchmark

Paper • 2504.10471 • Published Apr 14, 2025 • 21

upvoted an article about 1 year ago

Article

State of open video generation models in Diffusers

Jan 27, 2025

upvoted an article over 1 year ago

Article

Recoloring photos with diffusers

Oct 9, 2024

upvoted a paper over 1 year ago

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning

Paper • 2407.15680 • Published Jul 22, 2024 • 1

upvoted an article over 1 year ago

Article

The 5 Most Under-Rated Tools on Hugging Face

Aug 22, 2024

upvoted 2 papers over 1 year ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 133

LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

Paper • 2407.07895 • Published Jul 10, 2024 • 42

upvoted an article almost 2 years ago

Article

Design choices for Vision Language Models in 2024

Apr 16, 2024

upvoted a collection almost 2 years ago

🎭 Avatars

Collection

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 75 items • Updated Apr 20, 2025 • 93

upvoted a paper almost 2 years ago

FeatUp: A Model-Agnostic Framework for Features at Any Resolution

Paper • 2403.10516 • Published Mar 15, 2024 • 16

upvoted a collection about 2 years ago

Matryoshka Embedding Models

Collection

https://huggingface.co/blog/matryoshka • 14 items • Updated May 13, 2025 • 16

upvoted 3 papers about 2 years ago

How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts

Paper • 2402.13220 • Published Feb 20, 2024 • 14

Aria Everyday Activities Dataset

Paper • 2402.13349 • Published Feb 20, 2024 • 31

PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models

Paper • 2402.01118 • Published Feb 2, 2024 • 32

upvoted a collection about 2 years ago

AIM

Collection

AIM: Autoregressive Image Models • 5 items • Updated Aug 25, 2025 • 50

upvoted a paper about 2 years ago

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3, 2024 • 32

upvoted 2 papers over 2 years ago

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Paper • 2312.04461 • Published Dec 7, 2023 • 62

Describing Differences in Image Sets with Natural Language

Paper • 2312.02974 • Published Dec 5, 2023 • 15
