metadata
title: ColPali Visual Retrieval
emoji: 🔍
colorFrom: green
colorTo: blue
sdk: docker
sdk_version: '3.11'
app_file: app.py
pinned: false

ColPali Visual Retrieval with Vespa

A visual document retrieval system that combines ColPali, a vision language model that matches queries against page images using ColBERT-style late interaction over patch-level embeddings, with Vespa for scalable document search and question answering.
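
As a rough illustration of late interaction, the sketch below scores a page by summing, for each query token, its best dot product against any page patch (often called MaxSim). The function names and the toy 2-dimensional vectors are made up for this example; it is not the code this Space runs, and real embeddings have many more dimensions.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], page_patches: Sequence[Vector]) -> float:
    """Sum over query tokens of the highest dot product with any page patch."""
    return sum(max(dot(q, p) for p in page_patches) for q in query_tokens)


# Toy embeddings (hypothetical values, 2 dimensions for readability).
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.2, 0.8]]
page_b = [[0.5, 0.5], [0.4, 0.4]]

scores = {
    "page_a": maxsim_score(query, page_a),
    "page_b": maxsim_score(query, page_b),
}
best_page = max(scores, key=scores.get)
print(scores, best_page)
```

Because every query token picks its own best patch, a page can rank highly when different regions of it match different parts of the query.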

🌟 Features

  • Visual Document Search: Search through PDF documents using natural language queries
  • Token-level Similarity Maps: See which regions of a page match each token of your query
  • AI-Powered Chat: Ask questions about retrieved documents using Google Gemini
  • Multiple Ranking Methods: Choose between ColPali, BM25, or Hybrid ranking
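
A token-level similarity map can be sketched as follows: take one query token's embedding, compute its dot product with every patch embedding of a page, and lay the results out on the patch grid. This is a minimal sketch under stated assumptions: the function names are hypothetical, the patches are assumed to arrive in row-major order, and the 2x2 grid with 2-dimensional vectors is toy data.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity_map(token: Vector, patches: Sequence[Vector],
                   rows: int, cols: int) -> List[List[float]]:
    """Similarity of one query token to each patch, shaped as a rows x cols grid.

    Assumes patches are listed in row-major order (left to right, top to bottom).
    """
    if len(patches) != rows * cols:
        raise ValueError("number of patches must equal rows * cols")
    sims = [dot(token, p) for p in patches]
    return [sims[r * cols:(r + 1) * cols] for r in range(rows)]


def hottest_cell(grid: List[List[float]]) -> Tuple[int, int]:
    """Return (row, col) of the patch with the highest similarity."""
    return max(
        ((r, c) for r in range(len(grid)) for c in range(len(grid[r]))),
        key=lambda rc: grid[rc[0]][rc[1]],
    )


# Toy page with a 2x2 patch grid (hypothetical values).
patches = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
token = [1.0, 0.0]

grid = similarity_map(token, patches, rows=2, cols=2)
print(grid, hottest_cell(grid))
```

In a UI, a grid like this would typically be upscaled and overlaid on the page image as a heatmap, one map per query token.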

🚀 Try It Out

  1. Enter a natural language query in the search box
  2. Select your preferred ranking method
  3. Click a query token button to see its similarity map over each retrieved page
  4. Ask follow-up questions in the chat interface

📄 Sample Queries

  • "Pie chart with model comparison"
  • "Speaker diarization evaluation"
  • "Results table from dense retrieval"
  • "Graph showing training loss"
  • "Architecture diagram with transformer"

🛠️ Technology Stack

  • ColPali: Vision language model for visual document retrieval
  • Vespa: Distributed search engine for scalability
  • FastHTML: Modern web framework for the UI
  • Google Gemini: AI-powered question answering

📊 About the Dataset

This demo uses ~400 pages from AI-related research papers published in 2024. Each page is embedded as an image with ColPali, and the resulting visual embeddings enable semantic search across the page images.
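
To get a feel for the size of such an index, the back-of-the-envelope sketch below estimates raw embedding storage for about 400 pages. The vectors-per-page and dimension values are assumptions chosen for illustration, not figures stated by this demo; adjust them to match the model and storage format actually used.

```python
def embedding_bytes(pages: int, vectors_per_page: int, dims: int, bits_per_dim: int) -> int:
    """Raw storage for multi-vector page embeddings, ignoring index overhead."""
    total_bits = pages * vectors_per_page * dims * bits_per_dim
    return total_bits // 8


PAGES = 400              # "~400 pages" from the dataset description
VECTORS_PER_PAGE = 1030  # assumption: patch vectors produced per page image
DIMS = 128               # assumption: dimensions per patch vector

float32_bytes = embedding_bytes(PAGES, VECTORS_PER_PAGE, DIMS, bits_per_dim=32)
binary_bytes = embedding_bytes(PAGES, VECTORS_PER_PAGE, DIMS, bits_per_dim=1)

print(f"float32: {float32_bytes / 2**20:.1f} MiB")
print(f"binary:  {binary_bytes / 2**20:.1f} MiB")
```

Under these assumptions, storing one bit per dimension instead of a 32-bit float shrinks the raw embeddings by a factor of 32.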

🔗 Links