metadata

title: ColPali Visual Retrieval
emoji: 🔍
colorFrom: green
colorTo: blue
sdk: docker
sdk_version: '3.11'
app_file: app.py
pinned: false

ColPali Visual Retrieval with Vespa

A visual document retrieval system that combines ColPali, a vision-language retriever that applies ColBERT-style late interaction to image-patch embeddings, with Vespa for scalable document search and question answering.

🌟 Features

Visual Document Search: Search through PDF documents using natural language queries
Token-level Similarity Maps: See which regions of a page match each token of your query
AI-Powered Chat: Ask questions about retrieved documents using Google Gemini
Multiple Ranking Methods: Choose between ColPali, BM25, or Hybrid ranking
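
As a rough illustration of what the token-level similarity maps show, the sketch below scores a page with a late-interaction (MaxSim) rule and reshapes one query token's patch similarities into a grid. It is a pure-Python toy under stated assumptions, not this demo's actual code: the function names, the 2x2 patch grid, and the tiny two-dimensional embeddings are all hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], page_patches: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token take its best-matching
    patch similarity, then sum those maxima over all query tokens."""
    return sum(max(dot(q, p) for p in page_patches) for q in query_tokens)


def similarity_map(query_token: Vector, page_patches: Sequence[Vector],
                   grid_width: int) -> List[List[float]]:
    """Similarities of one query token to every patch, reshaped into grid rows.
    Assumes patches are stored row by row (a hypothetical layout)."""
    flat = [dot(query_token, p) for p in page_patches]
    return [flat[i:i + grid_width] for i in range(0, len(flat), grid_width)]


# Toy data: 2 query tokens, 4 patches (a 2x2 grid), 2-dimensional embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
patches = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 0.0]]

score = maxsim_score(query, patches)
heatmap = similarity_map(query[0], patches, grid_width=2)
```

In this toy case the first token matches the top-left patch most strongly, which is the kind of region a similarity map would highlight when its token button is clicked.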

🚀 Try It Out

Enter a natural language query in the search box
Select your preferred ranking method
Click on token buttons to see visual attention maps
Ask follow-up questions in the chat interface
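
The ranking methods offered in the second step can be pictured with a small sketch. This README does not say how the Hybrid option combines its signals, so the code below assumes one common approach: min-max normalize the ColPali and BM25 scores, then take a weighted sum. The `alpha` weight, the function names, and the sample scores are hypothetical.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so two rankers become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {page: 0.0 for page in scores}
    return {page: (s - lo) / (hi - lo) for page, s in scores.items()}


def hybrid_rank(colpali: Dict[str, float], bm25: Dict[str, float],
                alpha: float = 0.5) -> List[str]:
    """Order pages by alpha * ColPali + (1 - alpha) * BM25 (assumed blend).
    Pages missing from one ranker contribute 0 for that ranker."""
    c, b = min_max(colpali), min_max(bm25)
    combined = {
        page: alpha * c.get(page, 0.0) + (1 - alpha) * b.get(page, 0.0)
        for page in set(c) | set(b)
    }
    # Sort by descending score, breaking ties by page id for determinism.
    return sorted(combined, key=lambda page: (-combined[page], page))


# Made-up scores for three pages.
colpali_scores = {"page-1": 12.0, "page-2": 18.0, "page-3": 9.0}
bm25_scores = {"page-1": 4.0, "page-2": 1.0, "page-3": 3.0}

ranking = hybrid_rank(colpali_scores, bm25_scores, alpha=0.5)
```

Setting `alpha=1.0` reduces this to a ColPali-only ordering and `alpha=0.0` to a BM25-only ordering, which mirrors the three choices in the UI.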

📄 Sample Queries

"Pie chart with model comparison"
"Speaker diarization evaluation"
"Results table from dense retrieval"
"Graph showing training loss"
"Architecture diagram with transformer"

🛠️ Technology Stack

ColPali: Vision-language model for retrieving document page images
Vespa: Distributed search engine for scalability
FastHTML: Modern web framework for the UI
Google Gemini: AI-powered question answering

📊 About the Dataset

This demo uses ~400 pages from AI-related research papers published in 2024. Each page is processed with ColPali to produce patch-level visual embeddings, which enable semantic search across the page images.
