metadata
title: ColPali Visual Retrieval
emoji: π
colorFrom: green
colorTo: blue
sdk: docker
sdk_version: '3.11'
app_file: app.py
pinned: false
ColPali Visual Retrieval with Vespa
A visual document retrieval system that combines ColPali, a vision-language retriever that applies ColBERT-style late interaction to page-image patch embeddings, with Vespa for scalable document search and question answering.
Features
- Visual Document Search: Search through PDF documents using natural language queries
- Token-level Similarity Maps: Visualize exactly which parts of documents match your query
- AI-Powered Chat: Ask questions about retrieved documents using Google Gemini
- Multiple Ranking Methods: Choose between ColPali, BM25, or Hybrid ranking
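The late-interaction scoring and token-level similarity maps listed above can be illustrated with a minimal sketch. It assumes each query token and each page patch is already embedded as a plain vector; the function names (`maxsim_score`, `token_similarity_map`) and the `grid_width` parameter are hypothetical and are not taken from this demo's code.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], page_patches: Sequence[Vector]) -> float:
    """Late-interaction (MaxSim) score: for every query token, take the
    best-matching patch similarity, then sum over all query tokens."""
    return sum(max(dot(q, p) for p in page_patches) for q in query_tokens)


def token_similarity_map(
    query_token: Vector, page_patches: Sequence[Vector], grid_width: int
) -> List[List[float]]:
    """Similarity of one query token to every patch, reshaped into rows of
    `grid_width` patches (assumes patches are stored in row-major order)."""
    sims = [dot(query_token, p) for p in page_patches]
    return [sims[i:i + grid_width] for i in range(0, len(sims), grid_width)]


# Toy data: 2 query tokens, 4 patches laid out on a 2x2 grid.
query = [[1.0, 0.0], [0.0, 1.0]]
patches = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [0.0, 0.0]]

score = maxsim_score(query, patches)
heatmap = token_similarity_map(query[0], patches, grid_width=2)
```

In this toy case the first query token matches the top-left patch best, which is the kind of per-token signal a similarity map overlays on the page image.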
Try It Out
- Enter a natural language query in the search box
- Select your preferred ranking method
- Click a query token button to see its similarity map over the page
- Ask follow-up questions in the chat interface
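As a rough illustration of what choosing a ranking method could mean, the sketch below ranks pages by BM25 scores, by ColPali scores, or by a blend of the two. The blend shown here (min-max normalization followed by a weighted sum with weight `alpha`) is an assumption made for illustration; this README does not say how the demo's Hybrid ranking is actually computed, and the function names are hypothetical.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def rank(
    bm25: Dict[str, float],
    colpali: Dict[str, float],
    method: str = "hybrid",
    alpha: float = 0.5,
) -> List[str]:
    """Return page ids ordered best-first for the chosen method.

    `alpha` is the (assumed) weight on the ColPali score in hybrid mode.
    """
    if method == "bm25":
        combined = dict(bm25)
    elif method == "colpali":
        combined = dict(colpali)
    elif method == "hybrid":
        nb, nc = min_max_normalize(bm25), min_max_normalize(colpali)
        combined = {
            doc: alpha * nc.get(doc, 0.0) + (1 - alpha) * nb.get(doc, 0.0)
            for doc in set(nb) | set(nc)
        }
    else:
        raise ValueError(f"unknown ranking method: {method}")
    # Sort by descending score; break ties by page id for determinism.
    return sorted(combined, key=lambda d: (-combined[d], d))


# Toy scores for three pages.
bm25_scores = {"a": 10.0, "b": 5.0, "c": 0.0}
colpali_scores = {"a": 0.0, "b": 8.0, "c": 16.0}

hybrid_order = rank(bm25_scores, colpali_scores, method="hybrid", alpha=0.75)
```

Normalizing first keeps one signal from dominating only because its raw scores live on a larger scale.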
Sample Queries
- "Pie chart with model comparison"
- "Speaker diarization evaluation"
- "Results table from dense retrieval"
- "Graph showing training loss"
- "Architecture diagram with transformer"
Technology Stack
- ColPali: Vision-language model that embeds document page images for retrieval
- Vespa: Distributed search engine for scalability
- FastHTML: Modern web framework for the UI
- Google Gemini: AI-powered question answering
About the Dataset
This demo uses ~400 pages from AI-related research papers published in 2024. The documents are processed using ColPali to create visual embeddings that enable semantic search across document images.