---
title: MTEB Human Evaluation Demo
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 3.42.0
app_file: app.py
pinned: false
---

# MTEB Human Evaluation Demo

This is a demo of the human evaluation interface for the MTEB (Massive Text Embedding Benchmark) project. It allows annotators to evaluate the relevance of documents for reranking tasks.

## How to use

1. Navigate to the "Demo" tab to try the interface with an example dataset (AskUbuntuDupQuestions)
2. Read the query at the top
3. For each document, assign a rank using the dropdown (1 = most relevant)
4. Submit your rankings
5. Navigate between samples using the Previous/Next buttons
6. Your annotations are saved automatically

## About MTEB Human Evaluation

This project aims to establish human performance benchmarks for MTEB tasks, helping to understand the realistic "ceiling" for embedding model performance.
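
For reference, the sketch below shows how a ranking workflow like the one described under "How to use" could be wired up with Gradio 3.x. It is a minimal, self-contained illustration rather than the Space's actual `app.py`; the example query, documents, and the `save_rankings` helper are hypothetical placeholders.

```python
# Minimal sketch of a document-ranking interface in Gradio 3.x.
# NOT the Space's actual app.py; the query, documents, and
# save_rankings helper are hypothetical placeholders.
import gradio as gr

# In the real demo, the query and candidate documents come from the dataset.
QUERY = "How do I upgrade Ubuntu from the command line?"
DOCUMENTS = [
    "Use 'do-release-upgrade' to move to the next release.",
    "Changing the desktop wallpaper in Ubuntu.",
    "apt-get dist-upgrade only upgrades already-installed packages.",
]

def save_rankings(*ranks):
    """Pretend to persist the annotator's rankings; return a confirmation."""
    return f"Saved rankings: {list(ranks)}"

with gr.Blocks() as demo:
    gr.Markdown(f"**Query:** {QUERY}")
    rank_inputs = []
    for i, doc in enumerate(DOCUMENTS):
        with gr.Row():
            gr.Markdown(doc)
            # Rank 1 = most relevant, as in the demo's instructions.
            rank_inputs.append(
                gr.Dropdown(
                    choices=[str(r) for r in range(1, len(DOCUMENTS) + 1)],
                    label=f"Rank for document {i + 1}",
                )
            )
    status = gr.Markdown()
    gr.Button("Submit rankings").click(
        save_rankings, inputs=rank_inputs, outputs=status
    )

demo.launch()
```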