Spaces:
Running
on
CPU Upgrade
Running
on
CPU Upgrade
| # Your leaderboard name | |
| TITLE = """<h1 align="center" id="space-title">AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark | |
| (Preview) </h1>""" | |
| # What does your leaderboard evaluate? | |
| INTRODUCTION_TEXT = """ | |
| Check more information at [our GitHub repo](https://github.com/AIR-Bench/AIR-Bench) | |
| """ | |
| # Which evaluations are you running? how can people reproduce what you have? | |
| BENCHMARKS_TEXT = f""" | |
| ## How it works | |
| Check more information at [our GitHub repo](https://github.com/AIR-Bench/AIR-Bench) | |
| """ | |
| EVALUATION_QUEUE_TEXT = """ | |
| ## Steps for submit to AIR-Bench | |
| 1. Install AIR-Bench | |
| ```bash | |
| pip install air-benchmark | |
| ``` | |
| 2. Run the evaluation script | |
| ```bash | |
| cd AIR-Bench/scripts | |
| # Run all tasks | |
| python run_air_benchmark.py \\ | |
| --output_dir ./search_results \\ | |
| --encoder BAAI/bge-m3 \\ | |
| --reranker BAAI/bge-reranker-v2-m3 \\ | |
| --search_top_k 1000 \\ | |
| --rerank_top_k 100 \\ | |
| --max_query_length 512 \\ | |
| --max_passage_length 512 \\ | |
| --batch_size 512 \\ | |
| --pooling_method cls \\ | |
| --normalize_embeddings True \\ | |
| --use_fp16 True \\ | |
| --add_instruction False \\ | |
| --overwrite False | |
| # Run the tasks in the specified task type | |
| python run_air_benchmark.py \\ | |
| --task_types long-doc \\ | |
| --output_dir ./search_results \\ | |
| --encoder BAAI/bge-m3 \\ | |
| --reranker BAAI/bge-reranker-v2-m3 \\ | |
| --search_top_k 1000 \\ | |
| --rerank_top_k 100 \\ | |
| --max_query_length 512 \\ | |
| --max_passage_length 512 \\ | |
| --batch_size 512 \\ | |
| --pooling_method cls \\ | |
| --normalize_embeddings True \\ | |
| --use_fp16 True \\ | |
| --add_instruction False \\ | |
| --overwrite False | |
| # Run the tasks in the specified task type and domains | |
| python run_air_benchmark.py \\ | |
| --task_types long-doc \\ | |
| --domains arxiv book \\ | |
| --output_dir ./search_results \\ | |
| --encoder BAAI/bge-m3 \\ | |
| --reranker BAAI/bge-reranker-v2-m3 \\ | |
| --search_top_k 1000 \\ | |
| --rerank_top_k 100 \\ | |
| --max_query_length 512 \\ | |
| --max_passage_length 512 \\ | |
| --batch_size 512 \\ | |
| --pooling_method cls \\ | |
| --normalize_embeddings True \\ | |
| --use_fp16 True \\ | |
| --add_instruction False \\ | |
| --overwrite False | |
| # Run the tasks in the specified languages | |
| python run_air_benchmark.py \\ | |
| --languages en \\ | |
| --output_dir ./search_results \\ | |
| --encoder BAAI/bge-m3 \\ | |
| --reranker BAAI/bge-reranker-v2-m3 \\ | |
| --search_top_k 1000 \\ | |
| --rerank_top_k 100 \\ | |
| --max_query_length 512 \\ | |
| --max_passage_length 512 \\ | |
| --batch_size 512 \\ | |
| --pooling_method cls \\ | |
| --normalize_embeddings True \\ | |
| --use_fp16 True \\ | |
| --add_instruction False \\ | |
| --overwrite False | |
| # Run the tasks in the specified task type, domains, and languages | |
| python run_air_benchmark.py \\ | |
| --task_types qa \\ | |
| --domains wiki web \\ | |
| --languages en \\ | |
| --output_dir ./search_results \\ | |
| --encoder BAAI/bge-m3 \\ | |
| --reranker BAAI/bge-reranker-v2-m3 \\ | |
| --search_top_k 1000 \\ | |
| --rerank_top_k 100 \\ | |
| --max_query_length 512 \\ | |
| --max_passage_length 512 \\ | |
| --batch_size 512 \\ | |
| --pooling_method cls \\ | |
| --normalize_embeddings True \\ | |
| --use_fp16 True \\ | |
| --add_instruction False \\ | |
| --overwrite False | |
| ``` | |
| 3. Package the search results. | |
| ```bash | |
| # Zip "Embedding Model + NoReranker" search results in "<search_results>/<model_name>/NoReranker" to "<save_dir>/<model_name>_NoReranker.zip". | |
| python zip_results.py \\ | |
| --results_dir search_results \\ | |
| --model_name bge-m3 \\ | |
| --save_dir search_results/zipped_results | |
| # Zip "Embedding Model + Reranker" search results in "<search_results>/<model_name>/<reranker_name>" to "<save_dir>/<model_name>_<reranker_name>.zip". | |
| python zip_results.py \\ | |
| --results_dir search_results \\ | |
| --model_name bge-m3 \\ | |
| --reranker_name bge-reranker-v2-m3 \\ | |
| --save_dir search_results/zipped_results | |
| ``` | |
| 4. Upload the `.zip` file on this page and fill in the model information: | |
| - Model Name: such as `bge-m3`. | |
| - Model URL: such as `https://huggingface.co/BAAI/bge-m3`. | |
| - Reranker Name: such as `bge-reranker-v2-m3`. Keep empty for `NoReranker`. | |
| - Reranker URL: such as `https://huggingface.co/BAAI/bge-reranker-v2-m3`. Keep empty for `NoReranker`. | |
| If you want to stay anonymous, you can only fill in the Model Name and Reranker Name (keep empty for `NoReranker`), and check the selection box below befor submission. | |
| 5. Congratulation! Your results will be shown on the leaderboard in up to one hour. | |
| """ | |
| CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results" | |
| CITATION_BUTTON_TEXT = r""" | |
| """ | |