Commit 59d397b by zpn · verified · 1 Parent(s): df35d38

Add files using upload-large-folder tool

Files changed (5)
  1. README.md +183 -107
  2. adapter_model.safetensors +1 -1
  3. git_hash.txt +1 -1
  4. results.json +884 -884
  5. training_config.yml +4 -4
README.md CHANGED
@@ -1,126 +1,202 @@
1
  ---
2
- base_model:
3
- - Qwen/Qwen2.5-VL-7B-Instruct
4
  library_name: peft
5
- datasets:
6
- - llamaindex/vdr-multilingual-train
7
- - nomic-ai/colpali_train_set_split_by_source
8
- language:
9
- - en
10
- - it
11
- - fr
12
- - de
13
- - es
14
- pipeline_tag: visual-document-retrieval
15
- tags:
16
- - vidore
17
- - colpali
18
- - multimodal_embedding
19
- - multilingual_embedding
20
- - Text-to-Visual Document (T→VD) retrieval
21
- license: apache-2.0
22
  ---
23
 
24
- # ColNomic Embed Multimodal 7B: State-of-the-Art Visual Document Retrieval
25
 
26
- `colnomic-embed-multimodal-7b` is a multi-vector state-of-the-art multimodal embedding model that excels at visual document retrieval tasks:
27
 
28
- - **High Performance**: Achieves 61.6 NDCG@5 on Vidore-v2, outperforming all other models
29
- - **Unified Text-Image Encoding**: Directly encodes interleaved text and images without complex preprocessing
30
- - **Advanced Architecture**: 7B parameter multimodal embedding model
31
- - **Fully Open-Source**: Model weights available for commercial and research use
32
 
33
- ## Performance
34
 
35
- | Model | Avg. | ESG Restaurant Human | Econ Macro Multi. | AXA Multi. | MIT Bio | ESG Restaurant Synth. | ESG Restaurant Synth. Multi. | MIT Bio Multi. | AXA | Econ. Macro |
36
- |-------|------|----------------------|-------------------|------------|---------|----------------------|----------------------------|---------------|-----|------------|
37
- | **ColNomic Embed Multimodal 7B** | 61.6 | 67.1 | 54.9 | 59.7 | 66.1 | 58.1 | 56.9 | 64.3 | 67.4 | 59.6 |
38
- | ColNomic Embed Multimodal 3B | 61.2 | 65.8 | 55.4 | 61.0 | 63.5 | 56.6 | 57.2 | 62.5 | 68.8 | 60.2 |
39
- | T-Systems ColQwen2.5-3B | 59.9 | 72.1 | 51.2 | 60.0 | 65.3 | 51.7 | 53.3 | 61.7 | 69.3 | 54.8 |
40
- | Nomic Embed Multimodal 7B | 59.7 | 65.7 | 57.7 | 59.3 | 64.0 | 49.2 | 51.9 | 61.2 | 66.3 | 63.1 |
41
- | Nomic Embed Multimodal 3B | 58.8 | 59.8 | 57.5 | 58.8 | 62.5 | 49.4 | 49.4 | 58.6 | 69.6 | 63.5 |
42
- | Llama Index vdr-2b-multi-v1 | 58.4 | 63.1 | 52.8 | 61.0 | 60.6 | 50.3 | 51.2 | 56.9 | 68.8 | 61.2 |
43
- | Voyage Multimodal 3 | 55.0 | 56.1 | 55.0 | 59.5 | 56.4 | 47.2 | 46.2 | 51.5 | 64.1 | 58.8 |
44
 
45
- ## Model Architecture
46
 
47
- - **Total Parameters**: 7B
48
- - **Training Approach**: Fine-tuned from Qwen2.5-VL 7B Instruct
49
- - **Architecture Type**: Vision-Language Model with unified text and image input processing
50
- - **Key Innovations**:
51
- - Same-source sampling to create harder in-batch negatives
52
- - Multi-vector output option for enhanced performance
53
 
54
- ## Integration with RAG Workflows
55
 
56
- Nomic Embed Multimodal 7B seamlessly integrates with Retrieval Augmented Generation (RAG) workflows:
57
 
58
- 1. **Direct Document Embedding**: Skip OCR and complex processing by directly embedding document page images
59
- 2. **Faster Processing**: Eliminate preprocessing steps for quicker indexing
60
- 3. **More Complete Information**: Capture both textual and visual cues in a single embedding
61
- 4. **Simple Implementation**: Use the same API for both text and images
62
 
63
- ## Recommended Use Cases
64
 
65
- The model excels at handling real-world document retrieval scenarios that challenge traditional text-only systems:
66
 
67
- - **Research Papers**: Capture equations, diagrams, and tables
68
- - **Technical Documentation**: Encode code blocks, flowcharts, and screenshots
69
- - **Product Catalogs**: Represent images, specifications, and pricing tables
70
- - **Financial Reports**: Embed charts, graphs, and numerical data
71
- - **Visually Rich Content**: Where layout and visual information are important
72
- - **Multilingual Documents**: Where visual context provides important cues
73
 
74
  ## Training Details
75
 
76
- ColNomic Embed Multimodal 7B was developed through several key innovations:
77
-
78
- 1. **Sampling From the Same Source**: Forcing sampling from the same dataset source creates harder in-batch negatives, preventing the model from learning dataset artifacts.
79
-
80
- 2. **Multi-Vector Configuration**: Providing a multi-vector variant that achieves higher performance than the dense variant.
81
-
82
- ## Limitations
83
-
84
- - Performance may vary when processing documents with unconventional layouts or unusual visual elements
85
- - While it handles multiple languages, performance is strongest on English content
86
- - Processing very large or complex documents may require dividing them into smaller chunks
87
- - Performance on documents with handwriting or heavily stylized fonts may be reduced
88
-
89
- ## Join the Nomic Community
90
-
91
- - Nomic Embed Ecosystem: [https://www.nomic.ai/embed](https://www.nomic.ai/embed)
92
- - Website: [https://nomic.ai](https://nomic.ai)
93
- - Twitter: [https://twitter.com/nomic_ai](https://twitter.com/nomic_ai)
94
- - Discord: [https://discord.gg/myY5YDR8z8](https://discord.gg/myY5YDR8z8)
95
-
96
- ## Citation
97
-
98
- If you find this model useful in your research or applications, please consider citing:
99
-
100
- ```bibtex
101
- @misc{faysse2024colpaliefficientdocumentretrieval,
102
- title={ColPali: Efficient Document Retrieval with Vision Language Models},
103
- author={Manuel Faysse and Hugues Sibille and Tony Wu and Bilel Omrani and Gautier Viaud and Céline Hudelot and Pierre Colombo},
104
- year={2024},
105
- eprint={2407.01449},
106
- archivePrefix={arXiv},
107
- primaryClass={cs.IR},
108
- url={https://arxiv.org/abs/2407.01449},
109
- }
110
- @misc{ma2024unifyingmultimodalretrievaldocument,
111
- title={Unifying Multimodal Retrieval via Document Screenshot Embedding},
112
- author={Xueguang Ma and Sheng-Chieh Lin and Minghan Li and Wenhu Chen and Jimmy Lin},
113
- year={2024},
114
- eprint={2406.11251},
115
- archivePrefix={arXiv},
116
- primaryClass={cs.IR},
117
- url={https://arxiv.org/abs/2406.11251},
118
- }
119
- @misc{nomicembedmultimodal2025,
120
- title={Nomic Embed Multimodal: Interleaved Text, Image, and Screenshots for Visual Document Retrieval},
121
- author={Nomic Team},
122
- year={2025},
123
- publisher={Nomic AI},
124
- url={https://nomic-ai/blog/posts/nomic-embed-multimodal},
125
- }
126
- ```
1
  ---
2
+ base_model: Qwen/Qwen2.5-VL-7B-Instruct
3
  library_name: peft
4
  ---
5
 
6
+ # Model Card for Model ID
7
 
8
+ <!-- Provide a quick summary of what the model is/does. -->
9
 
10
 
11
 
12
+ ## Model Details
13
 
14
+ ### Model Description
15
 
16
+ <!-- Provide a longer summary of what this model is. -->
17
 
18
 
19
 
20
+ - **Developed by:** [More Information Needed]
21
+ - **Funded by [optional]:** [More Information Needed]
22
+ - **Shared by [optional]:** [More Information Needed]
23
+ - **Model type:** [More Information Needed]
24
+ - **Language(s) (NLP):** [More Information Needed]
25
+ - **License:** [More Information Needed]
26
+ - **Finetuned from model [optional]:** [More Information Needed]
27
 
28
+ ### Model Sources [optional]
29
 
30
+ <!-- Provide the basic links for the model. -->
31
 
32
+ - **Repository:** [More Information Needed]
33
+ - **Paper [optional]:** [More Information Needed]
34
+ - **Demo [optional]:** [More Information Needed]
35
+
36
+ ## Uses
37
+
38
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
+
40
+ ### Direct Use
41
+
42
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
+
44
+ [More Information Needed]
45
+
46
+ ### Downstream Use [optional]
47
+
48
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
+
50
+ [More Information Needed]
51
+
52
+ ### Out-of-Scope Use
53
+
54
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
+
56
+ [More Information Needed]
57
+
58
+ ## Bias, Risks, and Limitations
59
+
60
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
+
62
+ [More Information Needed]
63
+
64
+ ### Recommendations
65
+
66
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
+
68
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
+
70
+ ## How to Get Started with the Model
71
+
72
+ Use the code below to get started with the model.
73
+
74
+ [More Information Needed]
75
 
76
  ## Training Details
77
 
78
+ ### Training Data
79
+
80
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
+
82
+ [More Information Needed]
83
+
84
+ ### Training Procedure
85
+
86
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
+
88
+ #### Preprocessing [optional]
89
+
90
+ [More Information Needed]
91
+
92
+
93
+ #### Training Hyperparameters
94
+
95
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
+
97
+ #### Speeds, Sizes, Times [optional]
98
+
99
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
+
101
+ [More Information Needed]
102
+
103
+ ## Evaluation
104
+
105
+ <!-- This section describes the evaluation protocols and provides the results. -->
106
+
107
+ ### Testing Data, Factors & Metrics
108
+
109
+ #### Testing Data
110
+
111
+ <!-- This should link to a Dataset Card if possible. -->
112
+
113
+ [More Information Needed]
114
+
115
+ #### Factors
116
+
117
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
+
119
+ [More Information Needed]
120
+
121
+ #### Metrics
122
+
123
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
+
125
+ [More Information Needed]
126
+
127
+ ### Results
128
+
129
+ [More Information Needed]
130
+
131
+ #### Summary
132
+
133
+
134
+
135
+ ## Model Examination [optional]
136
+
137
+ <!-- Relevant interpretability work for the model goes here -->
138
+
139
+ [More Information Needed]
140
+
141
+ ## Environmental Impact
142
+
143
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
+
145
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
+
147
+ - **Hardware Type:** [More Information Needed]
148
+ - **Hours used:** [More Information Needed]
149
+ - **Cloud Provider:** [More Information Needed]
150
+ - **Compute Region:** [More Information Needed]
151
+ - **Carbon Emitted:** [More Information Needed]
152
+
153
+ ## Technical Specifications [optional]
154
+
155
+ ### Model Architecture and Objective
156
+
157
+ [More Information Needed]
158
+
159
+ ### Compute Infrastructure
160
+
161
+ [More Information Needed]
162
+
163
+ #### Hardware
164
+
165
+ [More Information Needed]
166
+
167
+ #### Software
168
+
169
+ [More Information Needed]
170
+
171
+ ## Citation [optional]
172
+
173
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
+
175
+ **BibTeX:**
176
+
177
+ [More Information Needed]
178
+
179
+ **APA:**
180
+
181
+ [More Information Needed]
182
+
183
+ ## Glossary [optional]
184
+
185
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
+
187
+ [More Information Needed]
188
+
189
+ ## More Information [optional]
190
+
191
+ [More Information Needed]
192
+
193
+ ## Model Card Authors [optional]
194
+
195
+ [More Information Needed]
196
+
197
+ ## Model Card Contact
198
+
199
+ [More Information Needed]
200
+ ### Framework versions
201
+
202
+ - PEFT 0.14.0
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:95ff4fa553f4adaf07eaba27c2803b6293c8824250361fe76d79319e0aba391d
+ oid sha256:aa6cba28368342ac6b235f49fcc3e0a5bd891d56df490c12bcbc6dc52c8377f3
  size 323489536
git_hash.txt CHANGED
@@ -1 +1 @@
- e4785653a193e4da012e058996429c98478c2c58
+ d3f780b24ba870601a046b4f35d28d827739446c
results.json CHANGED
@@ -1,228 +1,228 @@
1
  {
2
  "metadata": {
3
- "timestamp": "2025-03-29T04:54:33.560612",
4
  "vidore_benchmark_version": "5.0.1.dev6+g9e0da63"
5
  },
6
  "metrics": {
7
  "vidore/arxivqa_test_subsampled": {
8
- "ndcg_at_1": 0.846,
9
- "ndcg_at_3": 0.88855,
10
- "ndcg_at_5": 0.89268,
11
- "ndcg_at_10": 0.90254,
12
- "ndcg_at_20": 0.90708,
13
- "ndcg_at_50": 0.91061,
14
- "ndcg_at_100": 0.91093,
15
- "map_at_1": 0.846,
16
- "map_at_3": 0.87833,
17
- "map_at_5": 0.88063,
18
- "map_at_10": 0.8848,
19
- "map_at_20": 0.88604,
20
- "map_at_50": 0.88659,
21
- "map_at_100": 0.88661,
22
- "recall_at_1": 0.846,
23
- "recall_at_3": 0.918,
24
- "recall_at_5": 0.928,
25
  "recall_at_10": 0.958,
26
- "recall_at_20": 0.976,
27
- "recall_at_50": 0.994,
28
- "recall_at_100": 0.996,
29
- "precision_at_1": 0.846,
30
- "precision_at_3": 0.306,
31
- "precision_at_5": 0.1856,
32
  "precision_at_10": 0.0958,
33
- "precision_at_20": 0.0488,
34
- "precision_at_50": 0.01988,
35
- "precision_at_100": 0.00996,
36
- "mrr_at_1": 0.844,
37
- "mrr_at_3": 0.8759999999999999,
38
- "mrr_at_5": 0.8800999999999998,
39
- "mrr_at_10": 0.8837873015873016,
40
- "mrr_at_20": 0.884978105752177,
41
- "mrr_at_50": 0.8854220490930644,
42
- "mrr_at_100": 0.8854514608577703,
43
- "naucs_at_1_max": 0.8450125179508687,
44
- "naucs_at_1_std": 0.12069338010090619,
45
- "naucs_at_1_diff1": 0.9621919758028649,
46
- "naucs_at_3_max": 0.8291088793240907,
47
- "naucs_at_3_std": 0.04603630069914268,
48
- "naucs_at_3_diff1": 0.9314522556990277,
49
- "naucs_at_5_max": 0.8539267558875412,
50
- "naucs_at_5_std": 0.008105093889405354,
51
- "naucs_at_5_diff1": 0.9332788671023967,
52
- "naucs_at_10_max": 0.8914010048463799,
53
- "naucs_at_10_std": 0.049175225645809645,
54
- "naucs_at_10_diff1": 0.9502023031434798,
55
- "naucs_at_20_max": 0.9564270152505446,
56
- "naucs_at_20_std": 0.01746809835045214,
57
- "naucs_at_20_diff1": 0.9673202614379051,
58
- "naucs_at_50_max": 0.9564270152505304,
59
- "naucs_at_50_std": 0.08667911609088286,
60
- "naucs_at_50_diff1": 1.0,
61
- "naucs_at_100_max": 0.9346405228758466,
62
- "naucs_at_100_std": -0.3699813258636757,
63
- "naucs_at_100_diff1": 1.0
64
  },
65
  "vidore/docvqa_test_subsampled": {
66
- "ndcg_at_1": 0.5255,
67
- "ndcg_at_3": 0.59022,
68
- "ndcg_at_5": 0.6095,
69
- "ndcg_at_10": 0.62215,
70
- "ndcg_at_20": 0.63179,
71
- "ndcg_at_50": 0.64578,
72
- "ndcg_at_100": 0.65236,
73
- "map_at_1": 0.5255,
74
- "map_at_3": 0.57428,
75
- "map_at_5": 0.58503,
76
- "map_at_10": 0.5901,
77
- "map_at_20": 0.59281,
78
- "map_at_50": 0.5952,
79
- "map_at_100": 0.59581,
80
- "recall_at_1": 0.5255,
81
  "recall_at_3": 0.63636,
82
- "recall_at_5": 0.68293,
83
- "recall_at_10": 0.72284,
84
- "recall_at_20": 0.76053,
85
- "recall_at_50": 0.82927,
86
- "recall_at_100": 0.86918,
87
- "precision_at_1": 0.5255,
88
  "precision_at_3": 0.21212,
89
- "precision_at_5": 0.13659,
90
- "precision_at_10": 0.07228,
91
- "precision_at_20": 0.03803,
92
- "precision_at_50": 0.01659,
93
- "precision_at_100": 0.00869,
94
- "mrr_at_1": 0.5277161862527716,
95
- "mrr_at_3": 0.5705838876570585,
96
- "mrr_at_5": 0.5818920916481892,
97
- "mrr_at_10": 0.5882844120648999,
98
- "mrr_at_20": 0.5910046981583916,
99
- "mrr_at_50": 0.5934130734663197,
100
- "mrr_at_100": 0.5940080973172209,
101
- "naucs_at_1_max": 0.8434550008318404,
102
- "naucs_at_1_std": 0.6208689692801249,
103
- "naucs_at_1_diff1": 0.9385878043485472,
104
- "naucs_at_3_max": 0.8863600274794795,
105
- "naucs_at_3_std": 0.7125812271959884,
106
- "naucs_at_3_diff1": 0.8562659960495058,
107
- "naucs_at_5_max": 0.8957204704811382,
108
- "naucs_at_5_std": 0.7815099852986634,
109
- "naucs_at_5_diff1": 0.840615203999516,
110
- "naucs_at_10_max": 0.8805933290684814,
111
- "naucs_at_10_std": 0.8036536055359992,
112
- "naucs_at_10_diff1": 0.8108491152172002,
113
- "naucs_at_20_max": 0.8758585787087045,
114
- "naucs_at_20_std": 0.8547041593366219,
115
- "naucs_at_20_diff1": 0.7963142602467446,
116
- "naucs_at_50_max": 0.835882926621152,
117
- "naucs_at_50_std": 0.8642924736770558,
118
- "naucs_at_50_diff1": 0.7550004769464399,
119
- "naucs_at_100_max": 0.8387505235524102,
120
- "naucs_at_100_std": 0.891890167597154,
121
- "naucs_at_100_diff1": 0.7425247689508792
122
  },
123
  "vidore/infovqa_test_subsampled": {
124
- "ndcg_at_1": 0.90283,
125
- "ndcg_at_3": 0.92647,
126
- "ndcg_at_5": 0.93144,
127
- "ndcg_at_10": 0.93484,
128
- "ndcg_at_20": 0.93781,
129
- "ndcg_at_50": 0.93953,
130
- "ndcg_at_100": 0.94056,
131
- "map_at_1": 0.90283,
132
- "map_at_3": 0.92004,
133
- "map_at_5": 0.92277,
134
- "map_at_10": 0.92425,
135
- "map_at_20": 0.92501,
136
- "map_at_50": 0.92533,
137
- "map_at_100": 0.92544,
138
- "recall_at_1": 0.90283,
139
- "recall_at_3": 0.94534,
140
- "recall_at_5": 0.95749,
141
- "recall_at_10": 0.96761,
142
- "recall_at_20": 0.97976,
143
- "recall_at_50": 0.98785,
144
- "recall_at_100": 0.99393,
145
- "precision_at_1": 0.90283,
146
- "precision_at_3": 0.31511,
147
- "precision_at_5": 0.1915,
148
- "precision_at_10": 0.09676,
149
- "precision_at_20": 0.04899,
150
- "precision_at_50": 0.01976,
151
- "precision_at_100": 0.00994,
152
- "mrr_at_1": 0.902834008097166,
153
- "mrr_at_3": 0.9193657219973007,
154
- "mrr_at_5": 0.922098515519568,
155
- "mrr_at_10": 0.9235404215667373,
156
- "mrr_at_20": 0.924306062896783,
157
- "mrr_at_50": 0.9246081217189239,
158
- "mrr_at_100": 0.9247108914770626,
159
- "naucs_at_1_max": 0.7209924833135752,
160
- "naucs_at_1_std": 0.12016074691763526,
161
- "naucs_at_1_diff1": 0.9700705001101573,
162
- "naucs_at_3_max": 0.7991816729164284,
163
- "naucs_at_3_std": 0.3632744107706446,
164
- "naucs_at_3_diff1": 0.9806516364348475,
165
- "naucs_at_5_max": 0.7968086782065312,
166
- "naucs_at_5_std": 0.420630802871265,
167
- "naucs_at_5_diff1": 0.9751235325590942,
168
- "naucs_at_10_max": 0.921101425024486,
169
- "naucs_at_10_std": 0.6268094022259476,
170
- "naucs_at_10_diff1": 0.9755122273628534,
171
- "naucs_at_20_max": 0.9461062354112437,
172
- "naucs_at_20_std": 0.8385150814353919,
173
- "naucs_at_20_diff1": 0.9869398545935244,
174
- "naucs_at_50_max": 0.9319439680295378,
175
- "naucs_at_50_std": 0.8051426050527639,
176
- "naucs_at_50_diff1": 0.9782330909892136,
177
- "naucs_at_100_max": 0.9074217540806789,
178
- "naucs_at_100_std": 0.7588534820930969,
179
- "naucs_at_100_diff1": 0.9564661819784259
180
  },
181
  "vidore/tabfquad_test_subsampled": {
182
- "ndcg_at_1": 0.90714,
183
- "ndcg_at_3": 0.9463,
184
- "ndcg_at_5": 0.9506,
185
- "ndcg_at_10": 0.95517,
186
- "ndcg_at_20": 0.95606,
187
- "ndcg_at_50": 0.95606,
188
- "ndcg_at_100": 0.95606,
189
- "map_at_1": 0.90714,
190
- "map_at_3": 0.9375,
191
- "map_at_5": 0.93982,
192
- "map_at_10": 0.94168,
193
- "map_at_20": 0.94192,
194
- "map_at_50": 0.94192,
195
- "map_at_100": 0.94192,
196
- "recall_at_1": 0.90714,
197
- "recall_at_3": 0.97143,
198
- "recall_at_5": 0.98214,
199
- "recall_at_10": 0.99643,
200
  "recall_at_20": 1.0,
201
  "recall_at_50": 1.0,
202
  "recall_at_100": 1.0,
203
- "precision_at_1": 0.90714,
204
- "precision_at_3": 0.32381,
205
- "precision_at_5": 0.19643,
206
- "precision_at_10": 0.09964,
207
  "precision_at_20": 0.05,
208
  "precision_at_50": 0.02,
209
  "precision_at_100": 0.01,
210
- "mrr_at_1": 0.9035714285714286,
211
- "mrr_at_3": 0.9363095238095239,
212
- "mrr_at_5": 0.9393452380952383,
213
- "mrr_at_10": 0.940609410430839,
214
- "mrr_at_20": 0.9408475056689343,
215
- "mrr_at_50": 0.9408475056689343,
216
- "mrr_at_100": 0.9408475056689343,
217
- "naucs_at_1_max": 0.4907886231415652,
218
- "naucs_at_1_std": 0.07367305896717852,
219
- "naucs_at_1_diff1": 0.9411226028873094,
220
- "naucs_at_3_max": 0.9325980392156865,
221
- "naucs_at_3_std": 0.8605275443510717,
222
- "naucs_at_3_diff1": 0.9836601307189589,
223
- "naucs_at_5_max": 0.9477124183006519,
224
- "naucs_at_5_std": 0.8921568627451019,
225
- "naucs_at_5_diff1": 1.0,
226
  "naucs_at_10_max": 1.0,
227
  "naucs_at_10_std": 1.0,
228
  "naucs_at_10_diff1": 1.0,
@@ -237,159 +237,159 @@
237
  "naucs_at_100_diff1": 1.0
238
  },
239
  "vidore/tatdqa_test": {
240
- "ndcg_at_1": 0.71021,
241
- "ndcg_at_3": 0.79884,
242
- "ndcg_at_5": 0.81895,
243
- "ndcg_at_10": 0.82925,
244
- "ndcg_at_20": 0.83272,
245
- "ndcg_at_50": 0.8365,
246
- "ndcg_at_100": 0.83879,
247
- "map_at_1": 0.71021,
248
- "map_at_3": 0.77754,
249
- "map_at_5": 0.78875,
250
- "map_at_10": 0.79305,
251
- "map_at_20": 0.79406,
252
- "map_at_50": 0.79468,
253
- "map_at_100": 0.79489,
254
- "recall_at_1": 0.71021,
255
- "recall_at_3": 0.86027,
256
- "recall_at_5": 0.90887,
257
- "recall_at_10": 0.94046,
258
- "recall_at_20": 0.95383,
259
- "recall_at_50": 0.97266,
260
- "recall_at_100": 0.98663,
261
- "precision_at_1": 0.71021,
262
- "precision_at_3": 0.28676,
263
- "precision_at_5": 0.18177,
264
- "precision_at_10": 0.09405,
265
- "precision_at_20": 0.04769,
266
- "precision_at_50": 0.01945,
267
- "precision_at_100": 0.00987,
268
- "mrr_at_1": 0.7083839611178615,
269
- "mrr_at_3": 0.7776427703523693,
270
- "mrr_at_5": 0.7884568651275823,
271
- "mrr_at_10": 0.7926182279311078,
272
- "mrr_at_20": 0.7938033556784707,
273
- "mrr_at_50": 0.7943286181934197,
274
- "mrr_at_100": 0.794528350656401,
275
- "naucs_at_1_max": 0.32793936285585046,
276
- "naucs_at_1_std": 0.08355391640281296,
277
- "naucs_at_1_diff1": 0.8694893936676271,
278
- "naucs_at_3_max": 0.34310764558519663,
279
- "naucs_at_3_std": 0.18112933511633622,
280
- "naucs_at_3_diff1": 0.7842219442406141,
281
- "naucs_at_5_max": 0.390411682837578,
282
- "naucs_at_5_std": 0.22111733749024254,
283
- "naucs_at_5_diff1": 0.7610946297461103,
284
- "naucs_at_10_max": 0.4254863651763087,
285
- "naucs_at_10_std": 0.3546779556060681,
286
- "naucs_at_10_diff1": 0.741877974565238,
287
- "naucs_at_20_max": 0.4767731303920113,
288
- "naucs_at_20_std": 0.43791968327264075,
289
- "naucs_at_20_diff1": 0.752451856406408,
290
- "naucs_at_50_max": 0.4552115757814973,
291
- "naucs_at_50_std": 0.4781223784163035,
292
- "naucs_at_50_diff1": 0.7823468200671793,
293
- "naucs_at_100_max": 0.4328728795889847,
294
- "naucs_at_100_std": 0.6143216368426023,
295
- "naucs_at_100_diff1": 0.7683765100159515
296
  },
297
  "vidore/shiftproject_test": {
298
- "ndcg_at_1": 0.83,
299
- "ndcg_at_3": 0.90809,
300
- "ndcg_at_5": 0.91671,
301
- "ndcg_at_10": 0.91986,
302
- "ndcg_at_20": 0.91986,
303
- "ndcg_at_50": 0.92166,
304
- "ndcg_at_100": 0.92166,
305
- "map_at_1": 0.83,
306
- "map_at_3": 0.89,
307
- "map_at_5": 0.895,
308
- "map_at_10": 0.89625,
309
- "map_at_20": 0.89625,
310
- "map_at_50": 0.89647,
311
- "map_at_100": 0.89647,
312
- "recall_at_1": 0.83,
313
- "recall_at_3": 0.96,
314
- "recall_at_5": 0.98,
315
  "recall_at_10": 0.99,
316
  "recall_at_20": 0.99,
317
- "recall_at_50": 1.0,
318
  "recall_at_100": 1.0,
319
- "precision_at_1": 0.83,
320
- "precision_at_3": 0.32,
321
- "precision_at_5": 0.196,
322
  "precision_at_10": 0.099,
323
  "precision_at_20": 0.0495,
324
- "precision_at_50": 0.02,
325
  "precision_at_100": 0.01,
326
- "mrr_at_1": 0.85,
327
- "mrr_at_3": 0.9033333333333334,
328
- "mrr_at_5": 0.9083333333333334,
329
- "mrr_at_10": 0.9095833333333334,
330
- "mrr_at_20": 0.9095833333333334,
331
- "mrr_at_50": 0.909810606060606,
332
- "mrr_at_100": 0.909810606060606,
333
- "naucs_at_1_max": 0.007865559876938088,
334
- "naucs_at_1_std": -0.25152377082486815,
335
- "naucs_at_1_diff1": 0.7144308353166549,
336
- "naucs_at_3_max": 0.15744631185807553,
337
- "naucs_at_3_std": -0.24976657329598614,
338
- "naucs_at_3_diff1": 0.7741596638655437,
339
- "naucs_at_5_max": 0.45611577964519334,
340
- "naucs_at_5_std": 0.09337068160597826,
341
- "naucs_at_5_diff1": 0.8692810457516353,
342
- "naucs_at_10_max": 0.5541549953314738,
343
- "naucs_at_10_std": 0.35807656395891135,
344
  "naucs_at_10_diff1": 0.8692810457516413,
345
- "naucs_at_20_max": 0.5541549953314738,
346
- "naucs_at_20_std": 0.35807656395891135,
347
  "naucs_at_20_diff1": 0.8692810457516413,
348
- "naucs_at_50_max": null,
349
- "naucs_at_50_std": null,
350
- "naucs_at_50_diff1": null,
351
  "naucs_at_100_max": null,
352
  "naucs_at_100_std": null,
353
  "naucs_at_100_diff1": null
354
  },
355
  "vidore/syntheticDocQA_artificial_intelligence_test": {
356
- "ndcg_at_1": 0.97,
357
- "ndcg_at_3": 0.98893,
358
- "ndcg_at_5": 0.98893,
359
- "ndcg_at_10": 0.98893,
360
- "ndcg_at_20": 0.98893,
361
- "ndcg_at_50": 0.98893,
362
- "ndcg_at_100": 0.98893,
363
- "map_at_1": 0.97,
364
- "map_at_3": 0.985,
365
- "map_at_5": 0.985,
366
- "map_at_10": 0.985,
367
- "map_at_20": 0.985,
368
- "map_at_50": 0.985,
369
- "map_at_100": 0.985,
370
- "recall_at_1": 0.97,
371
  "recall_at_3": 1.0,
372
  "recall_at_5": 1.0,
373
  "recall_at_10": 1.0,
374
  "recall_at_20": 1.0,
375
  "recall_at_50": 1.0,
376
  "recall_at_100": 1.0,
377
- "precision_at_1": 0.97,
378
  "precision_at_3": 0.33333,
379
  "precision_at_5": 0.2,
380
  "precision_at_10": 0.1,
381
  "precision_at_20": 0.05,
382
  "precision_at_50": 0.02,
383
  "precision_at_100": 0.01,
384
- "mrr_at_1": 0.98,
385
- "mrr_at_3": 0.99,
386
- "mrr_at_5": 0.99,
387
- "mrr_at_10": 0.99,
388
- "mrr_at_20": 0.99,
389
- "mrr_at_50": 0.99,
390
- "mrr_at_100": 0.99,
391
- "naucs_at_1_max": 0.47945845004668913,
392
- "naucs_at_1_std": -0.8445378151260531,
393
  "naucs_at_1_diff1": 1.0,
394
  "naucs_at_3_max": 1.0,
395
  "naucs_at_3_std": 1.0,
@@ -413,18 +413,18 @@
413
  "vidore/syntheticDocQA_energy_test": {
414
  "ndcg_at_1": 0.95,
415
  "ndcg_at_3": 0.95631,
416
- "ndcg_at_5": 0.96448,
417
- "ndcg_at_10": 0.96749,
418
- "ndcg_at_20": 0.96749,
419
- "ndcg_at_50": 0.96965,
420
- "ndcg_at_100": 0.96965,
421
  "map_at_1": 0.95,
422
  "map_at_3": 0.955,
423
- "map_at_5": 0.9595,
424
- "map_at_10": 0.96061,
425
- "map_at_20": 0.96061,
426
- "map_at_50": 0.96103,
427
- "map_at_100": 0.96103,
428
  "recall_at_1": 0.95,
429
  "recall_at_3": 0.96,
430
  "recall_at_5": 0.98,
@@ -441,24 +441,24 @@
441
  "precision_at_100": 0.01,
442
  "mrr_at_1": 0.95,
443
  "mrr_at_3": 0.955,
444
- "mrr_at_5": 0.9595,
445
- "mrr_at_10": 0.9606111111111111,
446
- "mrr_at_20": 0.9606111111111111,
447
- "mrr_at_50": 0.9610277777777777,
448
- "mrr_at_100": 0.9610277777777777,
449
- "naucs_at_1_max": 0.2974789915966388,
450
- "naucs_at_1_std": -0.6985060690943007,
451
  "naucs_at_1_diff1": 1.0,
452
- "naucs_at_3_max": 0.8068394024276336,
453
- "naucs_at_3_std": -0.5852007469654534,
454
  "naucs_at_3_diff1": 1.0,
455
- "naucs_at_5_max": 0.9346405228758136,
456
- "naucs_at_5_std": -0.661531279178339,
457
  "naucs_at_5_diff1": 1.0,
458
- "naucs_at_10_max": 0.8692810457516413,
459
  "naucs_at_10_std": -1.1517273576097316,
460
  "naucs_at_10_diff1": 1.0,
461
- "naucs_at_20_max": 0.8692810457516413,
462
  "naucs_at_20_std": -1.1517273576097316,
463
  "naucs_at_20_diff1": 1.0,
464
  "naucs_at_50_max": null,
@@ -469,50 +469,50 @@
469
  "naucs_at_100_diff1": null
470
  },
471
  "vidore/syntheticDocQA_government_reports_test": {
472
- "ndcg_at_1": 0.95,
473
- "ndcg_at_3": 0.97131,
474
- "ndcg_at_5": 0.97131,
475
- "ndcg_at_10": 0.97487,
476
- "ndcg_at_20": 0.97487,
477
- "ndcg_at_50": 0.97487,
478
- "ndcg_at_100": 0.97487,
479
- "map_at_1": 0.95,
480
- "map_at_3": 0.965,
481
- "map_at_5": 0.965,
482
- "map_at_10": 0.96667,
483
- "map_at_20": 0.96667,
484
- "map_at_50": 0.96667,
485
- "map_at_100": 0.96667,
486
- "recall_at_1": 0.95,
487
- "recall_at_3": 0.99,
488
- "recall_at_5": 0.99,
489
  "recall_at_10": 1.0,
490
  "recall_at_20": 1.0,
491
  "recall_at_50": 1.0,
492
  "recall_at_100": 1.0,
493
- "precision_at_1": 0.95,
494
- "precision_at_3": 0.33,
495
- "precision_at_5": 0.198,
496
  "precision_at_10": 0.1,
497
  "precision_at_20": 0.05,
498
  "precision_at_50": 0.02,
499
  "precision_at_100": 0.01,
500
- "mrr_at_1": 0.95,
501
- "mrr_at_3": 0.965,
502
- "mrr_at_5": 0.965,
503
- "mrr_at_10": 0.9666666666666667,
504
- "mrr_at_20": 0.9666666666666667,
505
- "mrr_at_50": 0.9666666666666667,
506
- "mrr_at_100": 0.9666666666666667,
507
- "naucs_at_1_max": 0.8888888888888874,
508
- "naucs_at_1_std": 0.35079365079365044,
509
- "naucs_at_1_diff1": 0.9477124183006508,
510
- "naucs_at_3_max": 1.0,
511
- "naucs_at_3_std": 0.12278244631183229,
512
- "naucs_at_3_diff1": 1.0,
513
  "naucs_at_5_max": 1.0,
514
- "naucs_at_5_std": 0.12278244631185926,
515
- "naucs_at_5_diff1": 1.0,
516
  "naucs_at_10_max": 1.0,
517
  "naucs_at_10_std": 1.0,
518
  "naucs_at_10_diff1": 1.0,
@@ -562,8 +562,8 @@
562
  "mrr_at_20": 0.99,
563
  "mrr_at_50": 0.99,
564
  "mrr_at_100": 0.99,
565
- "naucs_at_1_max": 0.795751633986929,
566
- "naucs_at_1_std": 0.4225023342670396,
567
  "naucs_at_1_diff1": 1.0,
568
  "naucs_at_3_max": 1.0,
569
  "naucs_at_3_std": 1.0,
@@ -586,525 +586,525 @@
586
  },
587
  "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered": {
588
  "ndcg_at_1": 0.66875,
589
- "ndcg_at_3": 0.6417,
590
- "ndcg_at_5": 0.6613,
591
- "ndcg_at_10": 0.69824,
592
- "ndcg_at_20": 0.72188,
593
- "ndcg_at_50": 0.74328,
594
- "ndcg_at_100": 0.75194,
595
- "map_at_1": 0.4007,
596
- "map_at_3": 0.52943,
597
- "map_at_5": 0.56838,
598
- "map_at_10": 0.60781,
599
- "map_at_20": 0.62312,
600
- "map_at_50": 0.63095,
601
- "map_at_100": 0.63309,
602
- "recall_at_1": 0.4007,
603
- "recall_at_3": 0.5939,
604
- "recall_at_5": 0.67536,
605
- "recall_at_10": 0.78359,
606
- "recall_at_20": 0.85117,
607
- "recall_at_50": 0.91501,
608
- "recall_at_100": 0.9461,
609
  "precision_at_1": 0.66875,
610
- "precision_at_3": 0.38542,
611
- "precision_at_5": 0.29375,
612
- "precision_at_10": 0.19125,
613
- "precision_at_20": 0.11063,
614
- "precision_at_50": 0.05187,
615
- "precision_at_100": 0.02794,
616
- "mrr_at_1": 0.65625,
617
- "mrr_at_3": 0.7458333333333331,
618
- "mrr_at_5": 0.7564583333333331,
619
- "mrr_at_10": 0.7618030753968252,
620
- "mrr_at_20": 0.7638654799040827,
621
- "mrr_at_50": 0.7643048738434767,
622
- "mrr_at_100": 0.7644090405101435,
623
- "naucs_at_1_max": 0.30756403626346956,
624
- "naucs_at_1_std": -0.08408354529239741,
625
- "naucs_at_1_diff1": 0.4076474126922141,
626
- "naucs_at_3_max": 0.07090535127918343,
627
- "naucs_at_3_std": 0.07019913294946659,
628
- "naucs_at_3_diff1": -0.11532493775484426,
629
- "naucs_at_5_max": -0.10528352726834586,
630
- "naucs_at_5_std": 0.009354154420221325,
631
- "naucs_at_5_diff1": -0.24880882401264726,
632
- "naucs_at_10_max": -0.14362283829712982,
633
- "naucs_at_10_std": -0.00400331274744239,
634
- "naucs_at_10_diff1": -0.2469913716199774,
635
- "naucs_at_20_max": -0.16803539193002953,
636
- "naucs_at_20_std": 0.014308541119453818,
637
- "naucs_at_20_diff1": -0.2642816490417616,
638
- "naucs_at_50_max": -0.2187145395937884,
639
- "naucs_at_50_std": 0.06759328457586304,
640
- "naucs_at_50_diff1": -0.27016007321691166,
641
- "naucs_at_100_max": -0.24400788682123695,
642
- "naucs_at_100_std": 0.08084279607961682,
643
- "naucs_at_100_diff1": -0.2754889685246749
644
  },
645
  "vidore/synthetic_economics_macro_economy_2024_filtered_v1.0": {
646
- "ndcg_at_1": 0.62069,
647
- "ndcg_at_3": 0.6092,
648
- "ndcg_at_5": 0.59592,
649
- "ndcg_at_10": 0.57743,
650
- "ndcg_at_20": 0.60932,
651
- "ndcg_at_50": 0.67311,
652
- "ndcg_at_100": 0.71364,
653
- "map_at_1": 0.08389,
654
- "map_at_3": 0.19085,
655
- "map_at_5": 0.26062,
656
- "map_at_10": 0.32998,
657
- "map_at_20": 0.39499,
658
- "map_at_50": 0.4528,
659
- "map_at_100": 0.48214,
660
- "recall_at_1": 0.08389,
661
- "recall_at_3": 0.23451,
662
- "recall_at_5": 0.34383,
663
- "recall_at_10": 0.46383,
664
- "recall_at_20": 0.61979,
665
- "recall_at_50": 0.80662,
666
- "recall_at_100": 0.92438,
667
- "precision_at_1": 0.62069,
668
- "precision_at_3": 0.57471,
669
- "precision_at_5": 0.53103,
670
- "precision_at_10": 0.42586,
671
- "precision_at_20": 0.32328,
672
  "precision_at_50": 0.19759,
673
- "precision_at_100": 0.12931,
674
- "mrr_at_1": 0.6206896551724138,
675
- "mrr_at_3": 0.7528735632183907,
676
- "mrr_at_5": 0.7683908045977013,
677
- "mrr_at_10": 0.7683908045977013,
678
- "mrr_at_20": 0.7696223316912972,
679
- "mrr_at_50": 0.7703719568787035,
680
- "mrr_at_100": 0.7703719568787035,
681
- "naucs_at_1_max": 0.17281905871299316,
682
- "naucs_at_1_std": 0.27905569883146514,
683
- "naucs_at_1_diff1": 0.08408668436821552,
684
- "naucs_at_3_max": 0.10011229417523333,
685
- "naucs_at_3_std": 0.16636937514765232,
686
- "naucs_at_3_diff1": -0.012050364944396099,
687
- "naucs_at_5_max": 0.1464930806524891,
688
- "naucs_at_5_std": 0.1745709653462463,
689
- "naucs_at_5_diff1": -0.029613513564870166,
690
- "naucs_at_10_max": 0.17008192605864111,
691
- "naucs_at_10_std": 0.2625597911601204,
692
- "naucs_at_10_diff1": -0.01001710875601573,
693
- "naucs_at_20_max": 0.10235282326738067,
694
- "naucs_at_20_std": 0.2133598628675844,
695
- "naucs_at_20_diff1": 0.018116554485469376,
696
- "naucs_at_50_max": 0.003469746317767848,
697
- "naucs_at_50_std": 0.1780522203133001,
698
- "naucs_at_50_diff1": -0.033424370133009036,
699
- "naucs_at_100_max": -0.09027964384749197,
700
- "naucs_at_100_std": 0.08216836863021816,
701
- "naucs_at_100_diff1": -0.034970785400879036
702
  },
703
  "vidore/synthetic_rse_restaurant_filtered_v1.0": {
704
  "ndcg_at_1": 0.52632,
705
- "ndcg_at_3": 0.55612,
706
- "ndcg_at_5": 0.58127,
707
- "ndcg_at_10": 0.61632,
708
- "ndcg_at_20": 0.65128,
709
- "ndcg_at_50": 0.68849,
710
- "ndcg_at_100": 0.69812,
711
- "map_at_1": 0.26028,
712
- "map_at_3": 0.40415,
713
- "map_at_5": 0.45641,
714
- "map_at_10": 0.49836,
715
- "map_at_20": 0.51815,
716
- "map_at_50": 0.5376,
717
- "map_at_100": 0.54301,
718
- "recall_at_1": 0.26028,
719
- "recall_at_3": 0.52421,
720
- "recall_at_5": 0.62776,
721
- "recall_at_10": 0.73759,
722
- "recall_at_20": 0.85073,
723
- "recall_at_50": 0.95624,
724
- "recall_at_100": 0.98026,
725
  "precision_at_1": 0.52632,
726
- "precision_at_3": 0.38012,
727
- "precision_at_5": 0.29825,
728
- "precision_at_10": 0.19825,
729
- "precision_at_20": 0.12456,
730
- "precision_at_50": 0.06982,
731
- "precision_at_100": 0.0386,
732
- "mrr_at_1": 0.47368421052631576,
733
  "mrr_at_3": 0.6403508771929824,
734
- "mrr_at_5": 0.6473684210526315,
735
- "mrr_at_10": 0.6510721247563352,
736
- "mrr_at_20": 0.65645319592688,
737
- "mrr_at_50": 0.656969191798913,
738
- "mrr_at_100": 0.656969191798913,
739
- "naucs_at_1_max": -0.3089087067237991,
740
- "naucs_at_1_std": -0.3620464461508103,
741
- "naucs_at_1_diff1": 0.26950692327391307,
742
- "naucs_at_3_max": -0.2207300460743008,
743
- "naucs_at_3_std": -0.06741698035051424,
744
- "naucs_at_3_diff1": 0.25979433023582205,
745
- "naucs_at_5_max": -0.1447362086814054,
746
- "naucs_at_5_std": -0.03026829152849858,
747
- "naucs_at_5_diff1": 0.20242934642758445,
748
- "naucs_at_10_max": -0.19458395196496367,
749
- "naucs_at_10_std": 0.014281524836361133,
750
- "naucs_at_10_diff1": 0.0777278953601748,
751
- "naucs_at_20_max": -0.29118824877525396,
752
- "naucs_at_20_std": -0.020903684805961418,
753
- "naucs_at_20_diff1": -0.10611639351967059,
754
- "naucs_at_50_max": -0.3569642568932217,
755
- "naucs_at_50_std": -0.050538218015740266,
756
- "naucs_at_50_diff1": -0.20453025213923745,
757
- "naucs_at_100_max": -0.3449115539174466,
758
- "naucs_at_100_std": -0.006597932224841444,
759
- "naucs_at_100_diff1": -0.23912028985395903
760
  },
761
  "vidore/synthetic_axa_filtered_v1.0": {
762
- "ndcg_at_1": 0.55556,
763
- "ndcg_at_3": 0.65245,
764
- "ndcg_at_5": 0.67422,
765
- "ndcg_at_10": 0.69412,
766
- "ndcg_at_20": 0.72356,
767
- "ndcg_at_50": 0.76083,
768
- "ndcg_at_100": 0.77012,
769
- "map_at_1": 0.33205,
770
- "map_at_3": 0.4427,
771
- "map_at_5": 0.51242,
772
- "map_at_10": 0.58092,
773
- "map_at_20": 0.60853,
774
- "map_at_50": 0.63185,
775
- "map_at_100": 0.63523,
776
- "recall_at_1": 0.33205,
777
- "recall_at_3": 0.53629,
778
- "recall_at_5": 0.63362,
779
- "recall_at_10": 0.76473,
780
- "recall_at_20": 0.87608,
781
- "recall_at_50": 0.9613,
782
  "recall_at_100": 0.98765,
783
- "precision_at_1": 0.55556,
784
- "precision_at_3": 0.44444,
785
- "precision_at_5": 0.4,
786
- "precision_at_10": 0.28889,
787
- "precision_at_20": 0.17778,
788
- "precision_at_50": 0.08778,
789
  "precision_at_100": 0.04667,
790
  "mrr_at_1": 0.5555555555555556,
791
- "mrr_at_3": 0.7037037037037037,
792
- "mrr_at_5": 0.7175925925925926,
793
- "mrr_at_10": 0.7175925925925926,
794
- "mrr_at_20": 0.7212962962962962,
795
- "mrr_at_50": 0.7212962962962962,
796
- "mrr_at_100": 0.7212962962962962,
797
- "naucs_at_1_max": 0.273325424755173,
798
- "naucs_at_1_std": 0.30910760464389075,
799
- "naucs_at_1_diff1": 0.4150916518174012,
800
- "naucs_at_3_max": -0.6137789179273527,
801
- "naucs_at_3_std": -0.4695290775057839,
802
- "naucs_at_3_diff1": -0.18728844531342823,
803
- "naucs_at_5_max": -0.621465701003557,
804
- "naucs_at_5_std": -0.48529226853495566,
805
- "naucs_at_5_diff1": -0.2895489481138995,
806
- "naucs_at_10_max": -0.6844616087512149,
807
- "naucs_at_10_std": -0.5491938342713693,
808
- "naucs_at_10_diff1": -0.3997842666069631,
809
- "naucs_at_20_max": -0.7369836153031677,
810
- "naucs_at_20_std": -0.5537041155977006,
811
- "naucs_at_20_diff1": -0.4671308441713653,
812
- "naucs_at_50_max": -0.7286409969333564,
813
- "naucs_at_50_std": -0.5274202789653719,
814
- "naucs_at_50_diff1": -0.436336242874538,
815
- "naucs_at_100_max": -0.7196786978149535,
816
- "naucs_at_100_std": -0.5392516201828518,
817
- "naucs_at_100_diff1": -0.41765856148039054
818
  },
819
  "vidore/synthetic_rse_restaurant_filtered_v1.0_multilingual": {
820
- "ndcg_at_1": 0.49561,
821
- "ndcg_at_3": 0.52255,
822
- "ndcg_at_5": 0.56941,
823
- "ndcg_at_10": 0.60986,
824
- "ndcg_at_20": 0.64548,
825
- "ndcg_at_50": 0.67753,
826
- "ndcg_at_100": 0.69032,
827
- "map_at_1": 0.25293,
828
- "map_at_3": 0.38225,
829
- "map_at_5": 0.44555,
830
- "map_at_10": 0.49002,
831
- "map_at_20": 0.51143,
832
- "map_at_50": 0.52918,
833
- "map_at_100": 0.53512,
834
- "recall_at_1": 0.25293,
835
- "recall_at_3": 0.48103,
836
- "recall_at_5": 0.62079,
837
- "recall_at_10": 0.74725,
838
- "recall_at_20": 0.86193,
839
- "recall_at_50": 0.94719,
840
- "recall_at_100": 0.98575,
841
- "precision_at_1": 0.49561,
842
- "precision_at_3": 0.35965,
843
- "precision_at_5": 0.3,
844
- "precision_at_10": 0.20044,
845
- "precision_at_20": 0.12566,
846
- "precision_at_50": 0.06921,
847
- "precision_at_100": 0.03873,
848
- "mrr_at_1": 0.4605263157894737,
849
- "mrr_at_3": 0.600877192982456,
850
- "mrr_at_5": 0.6201754385964913,
851
- "mrr_at_10": 0.6277899610136451,
852
- "mrr_at_20": 0.632304501054501,
853
- "mrr_at_50": 0.632715568418136,
854
- "mrr_at_100": 0.6328516145519891,
855
- "naucs_at_1_max": -0.24080548401193705,
856
- "naucs_at_1_std": -0.24229129773797134,
857
- "naucs_at_1_diff1": 0.2566309953634961,
858
- "naucs_at_3_max": -0.11573008196728155,
859
- "naucs_at_3_std": -0.007165809143423392,
860
- "naucs_at_3_diff1": 0.148079184923162,
861
- "naucs_at_5_max": -0.1049432299517142,
862
- "naucs_at_5_std": -0.016303988468535046,
863
- "naucs_at_5_diff1": 0.09619296953820566,
864
- "naucs_at_10_max": -0.12738872206780832,
865
- "naucs_at_10_std": 0.0342733554094555,
866
- "naucs_at_10_diff1": -0.0031293439071024384,
867
- "naucs_at_20_max": -0.16828230912628597,
868
- "naucs_at_20_std": 0.03349608653409504,
869
- "naucs_at_20_diff1": -0.1295387505122871,
870
- "naucs_at_50_max": -0.22669113877952793,
871
- "naucs_at_50_std": 0.02892479553383963,
872
- "naucs_at_50_diff1": -0.23893906138517593,
873
- "naucs_at_100_max": -0.21963385965258986,
874
- "naucs_at_100_std": 0.07333502866946137,
875
- "naucs_at_100_diff1": -0.2789374899236579
876
  },
877
  "vidore/synthetic_axa_filtered_v1.0_multilingual": {
878
- "ndcg_at_1": 0.47222,
879
- "ndcg_at_3": 0.57264,
880
- "ndcg_at_5": 0.59702,
881
- "ndcg_at_10": 0.62186,
882
- "ndcg_at_20": 0.64871,
883
- "ndcg_at_50": 0.69211,
884
- "ndcg_at_100": 0.70854,
885
- "map_at_1": 0.26287,
886
- "map_at_3": 0.38199,
887
- "map_at_5": 0.44875,
888
- "map_at_10": 0.50906,
889
- "map_at_20": 0.53467,
890
- "map_at_50": 0.55691,
891
- "map_at_100": 0.56112,
892
- "recall_at_1": 0.26287,
893
- "recall_at_3": 0.48608,
894
- "recall_at_5": 0.59926,
895
- "recall_at_10": 0.72177,
896
- "recall_at_20": 0.8083,
897
- "recall_at_50": 0.91362,
898
- "recall_at_100": 0.97557,
899
- "precision_at_1": 0.47222,
900
- "precision_at_3": 0.39815,
901
- "precision_at_5": 0.35278,
902
- "precision_at_10": 0.25417,
903
- "precision_at_20": 0.15972,
904
- "precision_at_50": 0.08306,
905
- "precision_at_100": 0.04514,
906
- "mrr_at_1": 0.4722222222222222,
907
- "mrr_at_3": 0.6226851851851852,
908
- "mrr_at_5": 0.6386574074074074,
909
- "mrr_at_10": 0.6421296296296296,
910
- "mrr_at_20": 0.6446441539578793,
911
- "mrr_at_50": 0.6453055296192551,
912
- "mrr_at_100": 0.6457492415868007,
913
- "naucs_at_1_max": 0.2529652438308716,
914
- "naucs_at_1_std": 0.2770911675122595,
915
- "naucs_at_1_diff1": 0.3332157173129581,
916
- "naucs_at_3_max": -0.2624572173683075,
917
- "naucs_at_3_std": -0.1647564876291855,
918
- "naucs_at_3_diff1": -0.23020181455626973,
919
- "naucs_at_5_max": -0.3312941171013394,
920
- "naucs_at_5_std": -0.22957466421975894,
921
- "naucs_at_5_diff1": -0.2968823812573124,
922
- "naucs_at_10_max": -0.4218539955783105,
923
- "naucs_at_10_std": -0.24161976745630986,
924
- "naucs_at_10_diff1": -0.35725981793366396,
925
- "naucs_at_20_max": -0.46067265603833846,
926
- "naucs_at_20_std": -0.21758871325930773,
927
- "naucs_at_20_diff1": -0.4361139734376689,
928
- "naucs_at_50_max": -0.5078470632319775,
929
- "naucs_at_50_std": -0.22145210943780846,
930
- "naucs_at_50_diff1": -0.4588177782881521,
931
- "naucs_at_100_max": -0.5221289875651529,
932
- "naucs_at_100_std": -0.2612386533865634,
933
- "naucs_at_100_diff1": -0.4622709868852015
934
  },
935
  "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered_multilingual": {
936
- "ndcg_at_1": 0.63438,
937
- "ndcg_at_3": 0.62201,
938
- "ndcg_at_5": 0.64286,
939
- "ndcg_at_10": 0.67506,
940
- "ndcg_at_20": 0.70183,
941
- "ndcg_at_50": 0.72203,
942
- "ndcg_at_100": 0.73249,
943
- "map_at_1": 0.37943,
944
- "map_at_3": 0.50859,
945
- "map_at_5": 0.54911,
946
- "map_at_10": 0.58498,
947
- "map_at_20": 0.60163,
948
- "map_at_50": 0.60896,
949
- "map_at_100": 0.61143,
950
- "recall_at_1": 0.37943,
951
- "recall_at_3": 0.57556,
952
- "recall_at_5": 0.66588,
953
- "recall_at_10": 0.76488,
954
- "recall_at_20": 0.83818,
955
- "recall_at_50": 0.90049,
956
- "recall_at_100": 0.93787,
957
- "precision_at_1": 0.63438,
958
- "precision_at_3": 0.38021,
959
- "precision_at_5": 0.28844,
960
- "precision_at_10": 0.18469,
961
- "precision_at_20": 0.10945,
962
- "precision_at_50": 0.05062,
963
- "precision_at_100": 0.02764,
964
- "mrr_at_1": 0.628125,
965
- "mrr_at_3": 0.7169270833333328,
966
- "mrr_at_5": 0.7309114583333327,
967
- "mrr_at_10": 0.7371614583333328,
968
- "mrr_at_20": 0.7390357216997715,
969
- "mrr_at_50": 0.7397633603096262,
970
- "mrr_at_100": 0.7398304785536334,
971
- "naucs_at_1_max": 0.1453255594666248,
972
- "naucs_at_1_std": -0.13075943305407295,
973
- "naucs_at_1_diff1": 0.45529697520228624,
974
- "naucs_at_3_max": 0.0405855928262303,
975
- "naucs_at_3_std": -0.08315040203983201,
976
- "naucs_at_3_diff1": -0.06475583733859824,
977
- "naucs_at_5_max": -0.015242921963913872,
978
- "naucs_at_5_std": -0.06714686840424429,
979
- "naucs_at_5_diff1": -0.1905324708199068,
980
- "naucs_at_10_max": -0.08598060050339351,
981
- "naucs_at_10_std": -0.07140040506088999,
982
- "naucs_at_10_diff1": -0.22563239289159312,
983
- "naucs_at_20_max": -0.09063577299474389,
984
- "naucs_at_20_std": -0.01558920763779827,
985
- "naucs_at_20_diff1": -0.2565930359630854,
986
- "naucs_at_50_max": -0.11508082629877682,
987
- "naucs_at_50_std": 0.019171088523963847,
988
- "naucs_at_50_diff1": -0.2618590007895247,
989
- "naucs_at_100_max": -0.13214324446485648,
990
- "naucs_at_100_std": 0.03196841407290063,
991
- "naucs_at_100_diff1": -0.2604190397720941
992
  },
993
  "vidore/synthetics_economics_macro_economy_2024_filtered_v1.0_multilingual": {
994
- "ndcg_at_1": 0.54741,
995
- "ndcg_at_3": 0.5599,
996
- "ndcg_at_5": 0.54853,
997
- "ndcg_at_10": 0.545,
998
- "ndcg_at_20": 0.5783,
999
- "ndcg_at_50": 0.65023,
1000
- "ndcg_at_100": 0.68802,
1001
- "map_at_1": 0.06945,
1002
- "map_at_3": 0.17107,
1003
- "map_at_5": 0.22726,
1004
- "map_at_10": 0.29871,
1005
- "map_at_20": 0.36215,
1006
- "map_at_50": 0.424,
1007
- "map_at_100": 0.45027,
1008
- "recall_at_1": 0.06945,
1009
- "recall_at_3": 0.23305,
1010
- "recall_at_5": 0.32172,
1011
- "recall_at_10": 0.44954,
1012
- "recall_at_20": 0.60295,
1013
- "recall_at_50": 0.80472,
1014
- "recall_at_100": 0.91982,
1015
- "precision_at_1": 0.54741,
1016
- "precision_at_3": 0.53305,
1017
- "precision_at_5": 0.49397,
1018
- "precision_at_10": 0.40991,
1019
- "precision_at_20": 0.31358,
1020
- "precision_at_50": 0.19905,
1021
- "precision_at_100": 0.12728,
1022
- "mrr_at_1": 0.5474137931034483,
1023
- "mrr_at_3": 0.704022988505747,
1024
- "mrr_at_5": 0.7195402298850573,
1025
- "mrr_at_10": 0.7225933908045975,
1026
- "mrr_at_20": 0.7235478243021344,
1027
- "mrr_at_50": 0.7237352305989859,
1028
- "mrr_at_100": 0.7237352305989859,
1029
- "naucs_at_1_max": 0.00998672510834403,
1030
- "naucs_at_1_std": 0.11928244010618376,
1031
- "naucs_at_1_diff1": 0.18435607419736183,
1032
- "naucs_at_3_max": 0.09061238892019206,
1033
- "naucs_at_3_std": 0.16830880934007303,
1034
- "naucs_at_3_diff1": 0.12321929418645422,
1035
- "naucs_at_5_max": 0.06486164706859976,
1036
- "naucs_at_5_std": 0.12775077647982017,
1037
- "naucs_at_5_diff1": 0.1277592323026615,
1038
- "naucs_at_10_max": 0.08518832038722114,
1039
- "naucs_at_10_std": 0.16693423956201361,
1040
- "naucs_at_10_diff1": 0.031238610885063946,
1041
- "naucs_at_20_max": 0.09752587635034847,
1042
- "naucs_at_20_std": 0.20072724900497335,
1043
- "naucs_at_20_diff1": 0.018717834286324002,
1044
- "naucs_at_50_max": 0.03561724408666398,
1045
- "naucs_at_50_std": 0.15990917660678955,
1046
- "naucs_at_50_diff1": -0.06346129978133774,
1047
- "naucs_at_100_max": -0.028342706470580726,
1048
- "naucs_at_100_std": 0.09244086442839103,
1049
- "naucs_at_100_diff1": -0.07780688597803101
1050
  },
1051
  "vidore/restaurant_esg_reports_beir": {
1052
- "ndcg_at_1": 0.64103,
1053
- "ndcg_at_3": 0.64351,
1054
- "ndcg_at_5": 0.67059,
1055
- "ndcg_at_10": 0.70704,
1056
- "ndcg_at_20": 0.72722,
1057
- "ndcg_at_50": 0.74931,
1058
- "ndcg_at_100": 0.7528,
1059
- "map_at_1": 0.42885,
1060
- "map_at_3": 0.56026,
1061
- "map_at_5": 0.60129,
1062
- "map_at_10": 0.62973,
1063
- "map_at_20": 0.64241,
1064
- "map_at_50": 0.64967,
1065
- "map_at_100": 0.65059,
1066
- "recall_at_1": 0.42885,
1067
- "recall_at_3": 0.63173,
1068
- "recall_at_5": 0.7295,
1069
- "recall_at_10": 0.81704,
1070
- "recall_at_20": 0.8714,
1071
- "recall_at_50": 0.95845,
1072
- "recall_at_100": 0.9711,
1073
- "precision_at_1": 0.65385,
1074
- "precision_at_3": 0.35897,
1075
- "precision_at_5": 0.26923,
1076
- "precision_at_10": 0.16346,
1077
- "precision_at_20": 0.09519,
1078
- "precision_at_50": 0.04346,
1079
- "precision_at_100": 0.0225,
1080
- "mrr_at_1": 0.6730769230769231,
1081
- "mrr_at_3": 0.7339743589743589,
1082
- "mrr_at_5": 0.7483974358974359,
1083
- "mrr_at_10": 0.7558150183150183,
1084
- "mrr_at_20": 0.7589368964368963,
1085
- "mrr_at_50": 0.7600091917131535,
1086
- "mrr_at_100": 0.7600091917131535,
1087
- "naucs_at_1_max": 0.16031697778241094,
1088
- "naucs_at_1_std": 0.03122729708497878,
1089
- "naucs_at_1_diff1": 0.691822071573433,
1090
- "naucs_at_3_max": 0.08627887141319174,
1091
- "naucs_at_3_std": 0.016217193776361664,
1092
- "naucs_at_3_diff1": -0.005764084746853097,
1093
- "naucs_at_5_max": 0.12150676334655493,
1094
- "naucs_at_5_std": 0.1459461309588857,
1095
- "naucs_at_5_diff1": -0.11774835728053674,
1096
- "naucs_at_10_max": 0.007908068459048687,
1097
- "naucs_at_10_std": 0.17745560826730747,
1098
- "naucs_at_10_diff1": -0.23754271742179867,
1099
- "naucs_at_20_max": -0.0740960904448303,
1100
- "naucs_at_20_std": 0.17166248162182116,
1101
- "naucs_at_20_diff1": -0.3126851801378383,
1102
- "naucs_at_50_max": -0.13339720566722635,
1103
- "naucs_at_50_std": 0.13542366187464194,
1104
- "naucs_at_50_diff1": -0.35317283000026634,
1105
- "naucs_at_100_max": -0.16612439675549276,
1106
- "naucs_at_100_std": 0.10724055764337584,
1107
- "naucs_at_100_diff1": -0.35158659929056096
1108
  }
1109
  }
1110
  }
 
1
  {
2
  "metadata": {
3
+ "timestamp": "2025-03-31T23:54:56.140130",
4
  "vidore_benchmark_version": "5.0.1.dev6+g9e0da63"
5
  },
6
  "metrics": {
7
  "vidore/arxivqa_test_subsampled": {
8
+ "ndcg_at_1": 0.83,
9
+ "ndcg_at_3": 0.87533,
10
+ "ndcg_at_5": 0.88678,
11
+ "ndcg_at_10": 0.89447,
12
+ "ndcg_at_20": 0.89988,
13
+ "ndcg_at_50": 0.90114,
14
+ "ndcg_at_100": 0.90247,
15
+ "map_at_1": 0.83,
16
+ "map_at_3": 0.86467,
17
+ "map_at_5": 0.87097,
18
+ "map_at_10": 0.87409,
19
+ "map_at_20": 0.87551,
20
+ "map_at_50": 0.87574,
21
+ "map_at_100": 0.87587,
22
+ "recall_at_1": 0.83,
23
+ "recall_at_3": 0.906,
24
+ "recall_at_5": 0.934,
25
  "recall_at_10": 0.958,
26
+ "recall_at_20": 0.98,
27
+ "recall_at_50": 0.986,
28
+ "recall_at_100": 0.994,
29
+ "precision_at_1": 0.83,
30
+ "precision_at_3": 0.302,
31
+ "precision_at_5": 0.1868,
32
  "precision_at_10": 0.0958,
33
+ "precision_at_20": 0.049,
34
+ "precision_at_50": 0.01972,
35
+ "precision_at_100": 0.00994,
36
+ "mrr_at_1": 0.832,
37
+ "mrr_at_3": 0.8663333333333333,
38
+ "mrr_at_5": 0.8716333333333333,
39
+ "mrr_at_10": 0.8752095238095239,
40
+ "mrr_at_20": 0.8764481089945486,
41
+ "mrr_at_50": 0.8769236116400513,
42
+ "mrr_at_100": 0.8770087474728994,
43
+ "naucs_at_1_max": 0.7799094444766933,
44
+ "naucs_at_1_std": -0.08580135833284933,
45
+ "naucs_at_1_diff1": 0.9595692807801705,
46
+ "naucs_at_3_max": 0.7955579394878525,
47
+ "naucs_at_3_std": -0.11155809841667234,
48
+ "naucs_at_3_diff1": 0.9224725351133368,
49
+ "naucs_at_5_max": 0.7705061822708856,
50
+ "naucs_at_5_std": -0.25026172084995735,
51
+ "naucs_at_5_diff1": 0.9217666864725705,
52
+ "naucs_at_10_max": 0.7739540260548654,
53
+ "naucs_at_10_std": -0.21324085189631833,
54
+ "naucs_at_10_diff1": 0.9221910986616823,
55
+ "naucs_at_20_max": 0.7708216619981358,
56
+ "naucs_at_20_std": -0.0996732026143794,
57
+ "naucs_at_20_diff1": 0.8905228758169942,
58
+ "naucs_at_50_max": 0.816593303988261,
59
+ "naucs_at_50_std": -0.19987995198079136,
60
+ "naucs_at_50_diff1": 0.8832866479925231,
61
+ "naucs_at_100_max": 0.9564270152505304,
62
+ "naucs_at_100_std": 0.04310613134141329,
63
+ "naucs_at_100_diff1": 0.9564270152505304
64
  },
65
  "vidore/docvqa_test_subsampled": {
66
+ "ndcg_at_1": 0.5388,
67
+ "ndcg_at_3": 0.59542,
68
+ "ndcg_at_5": 0.61269,
69
+ "ndcg_at_10": 0.62919,
70
+ "ndcg_at_20": 0.64259,
71
+ "ndcg_at_50": 0.64929,
72
+ "ndcg_at_100": 0.65683,
73
+ "map_at_1": 0.5388,
74
+ "map_at_3": 0.5813,
75
+ "map_at_5": 0.59084,
76
+ "map_at_10": 0.59764,
77
+ "map_at_20": 0.6013,
78
+ "map_at_50": 0.60241,
79
+ "map_at_100": 0.60307,
80
+ "recall_at_1": 0.5388,
81
  "recall_at_3": 0.63636,
82
+ "recall_at_5": 0.67849,
83
+ "recall_at_10": 0.72949,
84
+ "recall_at_20": 0.78271,
85
+ "recall_at_50": 0.81596,
86
+ "recall_at_100": 0.86253,
87
+ "precision_at_1": 0.5388,
88
  "precision_at_3": 0.21212,
89
+ "precision_at_5": 0.1357,
90
+ "precision_at_10": 0.07295,
91
+ "precision_at_20": 0.03914,
92
+ "precision_at_50": 0.01632,
93
+ "precision_at_100": 0.00863,
94
+ "mrr_at_1": 0.5432372505543237,
95
+ "mrr_at_3": 0.5824094604582412,
96
+ "mrr_at_5": 0.5903917220990392,
97
+ "mrr_at_10": 0.5980097138633723,
98
+ "mrr_at_20": 0.601820515422298,
99
+ "mrr_at_50": 0.6029928998761527,
100
+ "mrr_at_100": 0.6035667141572244,
101
+ "naucs_at_1_max": 0.8430735352775856,
102
+ "naucs_at_1_std": 0.5939128313043085,
103
+ "naucs_at_1_diff1": 0.9018027917306838,
104
+ "naucs_at_3_max": 0.8808230124613547,
105
+ "naucs_at_3_std": 0.712681445860455,
106
+ "naucs_at_3_diff1": 0.8646220134947054,
107
+ "naucs_at_5_max": 0.8912698205476521,
108
+ "naucs_at_5_std": 0.7525159249929689,
109
+ "naucs_at_5_diff1": 0.8414611325660849,
110
+ "naucs_at_10_max": 0.8852509111743405,
111
+ "naucs_at_10_std": 0.817204422380503,
112
+ "naucs_at_10_diff1": 0.8264628195305581,
113
+ "naucs_at_20_max": 0.8512017211379178,
114
+ "naucs_at_20_std": 0.8215729110215341,
115
+ "naucs_at_20_diff1": 0.7992889048812711,
116
+ "naucs_at_50_max": 0.8402551716801772,
117
+ "naucs_at_50_std": 0.8733758748271155,
118
+ "naucs_at_50_diff1": 0.7960833079573262,
119
+ "naucs_at_100_max": 0.838262046357413,
120
+ "naucs_at_100_std": 0.904184740422419,
121
+ "naucs_at_100_diff1": 0.7996865824491111
122
  },
123
  "vidore/infovqa_test_subsampled": {
124
+ "ndcg_at_1": 0.90081,
125
+ "ndcg_at_3": 0.92934,
126
+ "ndcg_at_5": 0.93352,
127
+ "ndcg_at_10": 0.93669,
128
+ "ndcg_at_20": 0.93871,
129
+ "ndcg_at_50": 0.9403,
130
+ "ndcg_at_100": 0.94099,
131
+ "map_at_1": 0.90081,
132
+ "map_at_3": 0.9224,
133
+ "map_at_5": 0.92473,
134
+ "map_at_10": 0.92598,
135
+ "map_at_20": 0.92652,
136
+ "map_at_50": 0.92677,
137
+ "map_at_100": 0.92684,
138
+ "recall_at_1": 0.90081,
139
+ "recall_at_3": 0.94939,
140
+ "recall_at_5": 0.95951,
141
+ "recall_at_10": 0.96964,
142
+ "recall_at_20": 0.97773,
143
+ "recall_at_50": 0.98583,
144
+ "recall_at_100": 0.98988,
145
+ "precision_at_1": 0.90081,
146
+ "precision_at_3": 0.31646,
147
+ "precision_at_5": 0.1919,
148
+ "precision_at_10": 0.09696,
149
+ "precision_at_20": 0.04889,
150
+ "precision_at_50": 0.01972,
151
+ "precision_at_100": 0.0099,
152
+ "mrr_at_1": 0.9008097165991903,
153
+ "mrr_at_3": 0.9210526315789472,
154
+ "mrr_at_5": 0.9248987854251012,
155
+ "mrr_at_10": 0.9258040935672512,
156
+ "mrr_at_20": 0.926239353314252,
157
+ "mrr_at_50": 0.9265337558342984,
158
+ "mrr_at_100": 0.9266429029872886,
159
+ "naucs_at_1_max": 0.666687407456454,
160
+ "naucs_at_1_std": 0.04348917328850024,
161
+ "naucs_at_1_diff1": 0.9460185154989313,
162
+ "naucs_at_3_max": 0.887692938235415,
163
+ "naucs_at_3_std": 0.3724699632820944,
164
+ "naucs_at_3_diff1": 0.9582075346992748,
165
+ "naucs_at_5_max": 0.9121353932948266,
166
+ "naucs_at_5_std": 0.46274982003896187,
167
+ "naucs_at_5_diff1": 0.9608195637805694,
168
+ "naucs_at_10_max": 0.9553640600031791,
169
+ "naucs_at_10_std": 0.6527520622928011,
170
+ "naucs_at_10_diff1": 0.9564661819784096,
171
+ "naucs_at_20_max": 0.9643814216187027,
172
+ "naucs_at_20_std": 0.8342696288850537,
173
+ "naucs_at_20_diff1": 0.9643814216187027,
174
+ "naucs_at_50_max": 1.0,
175
+ "naucs_at_50_std": 0.960323608891707,
176
+ "naucs_at_50_diff1": 0.9626852988386237,
177
+ "naucs_at_100_max": 1.0,
178
+ "naucs_at_100_std": 1.0,
179
+ "naucs_at_100_diff1": 0.9477594183740937
180
  },
181
  "vidore/tabfquad_test_subsampled": {
182
+ "ndcg_at_1": 0.925,
183
+ "ndcg_at_3": 0.95561,
184
+ "ndcg_at_5": 0.95991,
185
+ "ndcg_at_10": 0.96464,
186
+ "ndcg_at_20": 0.96464,
187
+ "ndcg_at_50": 0.96464,
188
+ "ndcg_at_100": 0.96464,
189
+ "map_at_1": 0.925,
190
+ "map_at_3": 0.94881,
191
+ "map_at_5": 0.95113,
192
+ "map_at_10": 0.95314,
193
+ "map_at_20": 0.95314,
194
+ "map_at_50": 0.95314,
195
+ "map_at_100": 0.95314,
196
+ "recall_at_1": 0.925,
197
+ "recall_at_3": 0.975,
198
+ "recall_at_5": 0.98571,
199
+ "recall_at_10": 1.0,
200
  "recall_at_20": 1.0,
201
  "recall_at_50": 1.0,
202
  "recall_at_100": 1.0,
203
+ "precision_at_1": 0.925,
204
+ "precision_at_3": 0.325,
205
+ "precision_at_5": 0.19714,
206
+ "precision_at_10": 0.1,
207
  "precision_at_20": 0.05,
208
  "precision_at_50": 0.02,
209
  "precision_at_100": 0.01,
210
+ "mrr_at_1": 0.9214285714285714,
211
+ "mrr_at_3": 0.9464285714285714,
212
+ "mrr_at_5": 0.9487499999999999,
213
+ "mrr_at_10": 0.9507624716553289,
214
+ "mrr_at_20": 0.9507624716553289,
215
+ "mrr_at_50": 0.9507624716553289,
216
+ "mrr_at_100": 0.9507624716553289,
217
+ "naucs_at_1_max": 0.5344804588502072,
218
+ "naucs_at_1_std": 0.26817393624116476,
219
+ "naucs_at_1_diff1": 0.9383086567960518,
220
+ "naucs_at_3_max": 0.9416433239962675,
221
+ "naucs_at_3_std": 0.8989595838335351,
222
+ "naucs_at_3_diff1": 0.98132586367881,
223
+ "naucs_at_5_max": 0.967320261437913,
224
+ "naucs_at_5_std": 0.8558590102707857,
225
+ "naucs_at_5_diff1": 0.967320261437913,
226
  "naucs_at_10_max": 1.0,
227
  "naucs_at_10_std": 1.0,
228
  "naucs_at_10_diff1": 1.0,
 
237
  "naucs_at_100_diff1": 1.0
238
  },
239
  "vidore/tatdqa_test": {
240
+ "ndcg_at_1": 0.70109,
241
+ "ndcg_at_3": 0.79187,
242
+ "ndcg_at_5": 0.81111,
243
+ "ndcg_at_10": 0.82342,
244
+ "ndcg_at_20": 0.8274,
245
+ "ndcg_at_50": 0.83026,
246
+ "ndcg_at_100": 0.83293,
247
+ "map_at_1": 0.70109,
248
+ "map_at_3": 0.77005,
249
+ "map_at_5": 0.78071,
250
+ "map_at_10": 0.78598,
251
+ "map_at_20": 0.78706,
252
+ "map_at_50": 0.78751,
253
+ "map_at_100": 0.78775,
254
+ "recall_at_1": 0.70109,
255
+ "recall_at_3": 0.8548,
256
+ "recall_at_5": 0.90158,
257
+ "recall_at_10": 0.93864,
258
+ "recall_at_20": 0.95443,
259
+ "recall_at_50": 0.96902,
260
+ "recall_at_100": 0.98542,
261
+ "precision_at_1": 0.70109,
262
+ "precision_at_3": 0.28493,
263
+ "precision_at_5": 0.18032,
264
+ "precision_at_10": 0.09386,
265
+ "precision_at_20": 0.04772,
266
+ "precision_at_50": 0.01938,
267
+ "precision_at_100": 0.00985,
268
+ "mrr_at_1": 0.7023086269744836,
269
+ "mrr_at_3": 0.7714661806399354,
270
+ "mrr_at_5": 0.782978938841637,
271
+ "mrr_at_10": 0.7879385330478899,
272
+ "mrr_at_20": 0.7889867733618092,
273
+ "mrr_at_50": 0.7893770561483674,
274
+ "mrr_at_100": 0.7896057919808999,
275
+ "naucs_at_1_max": 0.25184895001768326,
276
+ "naucs_at_1_std": 0.037017257538968965,
277
+ "naucs_at_1_diff1": 0.8508843501202716,
278
+ "naucs_at_3_max": 0.28836101906742934,
279
+ "naucs_at_3_std": 0.0895587034833779,
280
+ "naucs_at_3_diff1": 0.7677154118662606,
281
+ "naucs_at_5_max": 0.33141107758129335,
282
+ "naucs_at_5_std": 0.1402343443047445,
283
+ "naucs_at_5_diff1": 0.726113129605695,
284
+ "naucs_at_10_max": 0.3753484131980775,
285
+ "naucs_at_10_std": 0.21181536222478634,
286
+ "naucs_at_10_diff1": 0.7110698774619854,
287
+ "naucs_at_20_max": 0.398127644970406,
288
+ "naucs_at_20_std": 0.2900002928945751,
289
+ "naucs_at_20_diff1": 0.6944425419244158,
290
+ "naucs_at_50_max": 0.4094042090297014,
291
+ "naucs_at_50_std": 0.3282254760208566,
292
+ "naucs_at_50_diff1": 0.665997592036148,
293
+ "naucs_at_100_max": 0.2601633836764814,
294
+ "naucs_at_100_std": 0.2532897060083396,
295
+ "naucs_at_100_diff1": 0.6808982214080673
296
  },
297
  "vidore/shiftproject_test": {
298
+ "ndcg_at_1": 0.84,
299
+ "ndcg_at_3": 0.91809,
300
+ "ndcg_at_5": 0.91809,
301
+ "ndcg_at_10": 0.92499,
302
+ "ndcg_at_20": 0.92499,
303
+ "ndcg_at_50": 0.92499,
304
+ "ndcg_at_100": 0.92673,
305
+ "map_at_1": 0.84,
306
+ "map_at_3": 0.9,
307
+ "map_at_5": 0.9,
308
+ "map_at_10": 0.9031,
309
+ "map_at_20": 0.9031,
310
+ "map_at_50": 0.9031,
311
+ "map_at_100": 0.90328,
312
+ "recall_at_1": 0.84,
313
+ "recall_at_3": 0.97,
314
+ "recall_at_5": 0.97,
315
  "recall_at_10": 0.99,
316
  "recall_at_20": 0.99,
317
+ "recall_at_50": 0.99,
318
  "recall_at_100": 1.0,
319
+ "precision_at_1": 0.84,
320
+ "precision_at_3": 0.32333,
321
+ "precision_at_5": 0.194,
322
  "precision_at_10": 0.099,
323
  "precision_at_20": 0.0495,
324
+ "precision_at_50": 0.0198,
325
  "precision_at_100": 0.01,
326
+ "mrr_at_1": 0.84,
327
+ "mrr_at_3": 0.8983333333333333,
328
+ "mrr_at_5": 0.8983333333333333,
329
+ "mrr_at_10": 0.9014285714285714,
330
+ "mrr_at_20": 0.9014285714285714,
331
+ "mrr_at_50": 0.9014285714285714,
332
+ "mrr_at_100": 0.9016208791208791,
333
+ "naucs_at_1_max": 0.1685076092292596,
334
+ "naucs_at_1_std": -0.2145925380461453,
335
+ "naucs_at_1_diff1": 0.7751595483554242,
336
+ "naucs_at_3_max": 0.6498599439775861,
337
+ "naucs_at_3_std": 0.36834733893557625,
338
+ "naucs_at_3_diff1": 0.664021164021167,
339
+ "naucs_at_5_max": 0.6498599439775937,
340
+ "naucs_at_5_std": 0.36834733893557176,
341
+ "naucs_at_5_diff1": 0.6640211640211615,
342
+ "naucs_at_10_max": 0.7222222222222276,
343
+ "naucs_at_10_std": 0.5541549953314738,
344
  "naucs_at_10_diff1": 0.8692810457516413,
345
+ "naucs_at_20_max": 0.7222222222222276,
346
+ "naucs_at_20_std": 0.5541549953314738,
347
  "naucs_at_20_diff1": 0.8692810457516413,
348
+ "naucs_at_50_max": 0.7222222222222041,
349
+ "naucs_at_50_std": 0.554154995331464,
350
+ "naucs_at_50_diff1": 0.8692810457516374,
351
  "naucs_at_100_max": null,
352
  "naucs_at_100_std": null,
353
  "naucs_at_100_diff1": null
354
  },
355
  "vidore/syntheticDocQA_artificial_intelligence_test": {
356
+ "ndcg_at_1": 0.98,
357
+ "ndcg_at_3": 0.99262,
358
+ "ndcg_at_5": 0.99262,
359
+ "ndcg_at_10": 0.99262,
360
+ "ndcg_at_20": 0.99262,
361
+ "ndcg_at_50": 0.99262,
362
+ "ndcg_at_100": 0.99262,
363
+ "map_at_1": 0.98,
364
+ "map_at_3": 0.99,
365
+ "map_at_5": 0.99,
366
+ "map_at_10": 0.99,
367
+ "map_at_20": 0.99,
368
+ "map_at_50": 0.99,
369
+ "map_at_100": 0.99,
370
+ "recall_at_1": 0.98,
371
  "recall_at_3": 1.0,
372
  "recall_at_5": 1.0,
373
  "recall_at_10": 1.0,
374
  "recall_at_20": 1.0,
375
  "recall_at_50": 1.0,
376
  "recall_at_100": 1.0,
377
+ "precision_at_1": 0.98,
378
  "precision_at_3": 0.33333,
379
  "precision_at_5": 0.2,
380
  "precision_at_10": 0.1,
381
  "precision_at_20": 0.05,
382
  "precision_at_50": 0.02,
383
  "precision_at_100": 0.01,
384
+ "mrr_at_1": 0.99,
385
+ "mrr_at_3": 0.995,
386
+ "mrr_at_5": 0.995,
387
+ "mrr_at_10": 0.995,
388
+ "mrr_at_20": 0.995,
389
+ "mrr_at_50": 0.995,
390
+ "mrr_at_100": 0.995,
391
+ "naucs_at_1_max": 0.540149393090569,
392
+ "naucs_at_1_std": -0.6909430438842186,
393
  "naucs_at_1_diff1": 1.0,
394
  "naucs_at_3_max": 1.0,
395
  "naucs_at_3_std": 1.0,
 
413
  "vidore/syntheticDocQA_energy_test": {
414
  "ndcg_at_1": 0.95,
415
  "ndcg_at_3": 0.95631,
416
+ "ndcg_at_5": 0.96492,
417
+ "ndcg_at_10": 0.96781,
418
+ "ndcg_at_20": 0.96781,
419
+ "ndcg_at_50": 0.96976,
420
+ "ndcg_at_100": 0.96976,
421
  "map_at_1": 0.95,
422
  "map_at_3": 0.955,
423
+ "map_at_5": 0.96,
424
+ "map_at_10": 0.961,
425
+ "map_at_20": 0.961,
426
+ "map_at_50": 0.96129,
427
+ "map_at_100": 0.96129,
428
  "recall_at_1": 0.95,
429
  "recall_at_3": 0.96,
430
  "recall_at_5": 0.98,
 
441
  "precision_at_100": 0.01,
442
  "mrr_at_1": 0.95,
443
  "mrr_at_3": 0.955,
444
+ "mrr_at_5": 0.96,
445
+ "mrr_at_10": 0.961,
446
+ "mrr_at_20": 0.961,
447
+ "mrr_at_50": 0.9613030303030302,
448
+ "mrr_at_100": 0.9613030303030302,
449
+ "naucs_at_1_max": 0.32362278244631343,
450
+ "naucs_at_1_std": -0.7713352007469629,
451
  "naucs_at_1_diff1": 1.0,
452
+ "naucs_at_3_max": 0.8395191409897237,
453
+ "naucs_at_3_std": -0.6762371615312763,
454
  "naucs_at_3_diff1": 1.0,
455
+ "naucs_at_5_max": 0.6790382819794609,
456
+ "naucs_at_5_std": -1.445845004668519,
457
  "naucs_at_5_diff1": 1.0,
458
+ "naucs_at_10_max": 1.0,
459
  "naucs_at_10_std": -1.1517273576097316,
460
  "naucs_at_10_diff1": 1.0,
461
+ "naucs_at_20_max": 1.0,
462
  "naucs_at_20_std": -1.1517273576097316,
463
  "naucs_at_20_diff1": 1.0,
464
  "naucs_at_50_max": null,
 
469
  "naucs_at_100_diff1": null
470
  },
471
  "vidore/syntheticDocQA_government_reports_test": {
472
+ "ndcg_at_1": 0.91,
473
+ "ndcg_at_3": 0.94655,
474
+ "ndcg_at_5": 0.95085,
475
+ "ndcg_at_10": 0.95743,
476
+ "ndcg_at_20": 0.95743,
477
+ "ndcg_at_50": 0.95743,
478
+ "ndcg_at_100": 0.95743,
479
+ "map_at_1": 0.91,
480
+ "map_at_3": 0.93833,
481
+ "map_at_5": 0.94083,
482
+ "map_at_10": 0.94361,
483
+ "map_at_20": 0.94361,
484
+ "map_at_50": 0.94361,
485
+ "map_at_100": 0.94361,
486
+ "recall_at_1": 0.91,
487
+ "recall_at_3": 0.97,
488
+ "recall_at_5": 0.98,
489
  "recall_at_10": 1.0,
490
  "recall_at_20": 1.0,
491
  "recall_at_50": 1.0,
492
  "recall_at_100": 1.0,
493
+ "precision_at_1": 0.91,
494
+ "precision_at_3": 0.32333,
495
+ "precision_at_5": 0.196,
496
  "precision_at_10": 0.1,
497
  "precision_at_20": 0.05,
498
  "precision_at_50": 0.02,
499
  "precision_at_100": 0.01,
500
+ "mrr_at_1": 0.92,
501
+ "mrr_at_3": 0.9433333333333332,
502
+ "mrr_at_5": 0.9483333333333333,
503
+ "mrr_at_10": 0.9494444444444444,
504
+ "mrr_at_20": 0.9494444444444444,
505
+ "mrr_at_50": 0.9494444444444444,
506
+ "mrr_at_100": 0.9494444444444444,
507
+ "naucs_at_1_max": 0.6774561676522453,
508
+ "naucs_at_1_std": 0.14721444133208894,
509
+ "naucs_at_1_diff1": 0.9564270152505436,
510
+ "naucs_at_3_max": 0.8513849984438244,
511
+ "naucs_at_3_std": -0.24929971988795643,
512
+ "naucs_at_3_diff1": 0.9564270152505466,
513
  "naucs_at_5_max": 1.0,
514
+ "naucs_at_5_std": -0.43534080298785716,
515
+ "naucs_at_5_diff1": 0.9346405228758136,
516
  "naucs_at_10_max": 1.0,
517
  "naucs_at_10_std": 1.0,
518
  "naucs_at_10_diff1": 1.0,
 
562
  "mrr_at_20": 0.99,
563
  "mrr_at_50": 0.99,
564
  "mrr_at_100": 0.99,
565
+ "naucs_at_1_max": 0.7222222222222248,
566
+ "naucs_at_1_std": -0.5634920634920563,
567
  "naucs_at_1_diff1": 1.0,
568
  "naucs_at_3_max": 1.0,
569
  "naucs_at_3_std": 1.0,
 
586
  },
587
  "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered": {
588
  "ndcg_at_1": 0.66875,
589
+ "ndcg_at_3": 0.64833,
590
+ "ndcg_at_5": 0.66136,
591
+ "ndcg_at_10": 0.7002,
592
+ "ndcg_at_20": 0.72343,
593
+ "ndcg_at_50": 0.74221,
594
+ "ndcg_at_100": 0.7504,
595
+ "map_at_1": 0.41001,
596
+ "map_at_3": 0.53585,
597
+ "map_at_5": 0.57453,
598
+ "map_at_10": 0.61456,
599
+ "map_at_20": 0.63039,
600
+ "map_at_50": 0.63734,
601
+ "map_at_100": 0.63926,
602
+ "recall_at_1": 0.41001,
603
+ "recall_at_3": 0.59207,
604
+ "recall_at_5": 0.66951,
605
+ "recall_at_10": 0.7842,
606
+ "recall_at_20": 0.84612,
607
+ "recall_at_50": 0.90015,
608
+ "recall_at_100": 0.92697,
609
  "precision_at_1": 0.66875,
610
+ "precision_at_3": 0.39375,
611
+ "precision_at_5": 0.29125,
612
+ "precision_at_10": 0.18938,
613
+ "precision_at_20": 0.11,
614
+ "precision_at_50": 0.05087,
615
+ "precision_at_100": 0.0275,
616
+ "mrr_at_1": 0.65,
617
+ "mrr_at_3": 0.7364583333333331,
618
+ "mrr_at_5": 0.7480208333333331,
619
+ "mrr_at_10": 0.7537127976190475,
620
+ "mrr_at_20": 0.7549048402255638,
621
+ "mrr_at_50": 0.755591370159045,
622
+ "mrr_at_100": 0.755591370159045,
623
+ "naucs_at_1_max": 0.3941383204398901,
624
+ "naucs_at_1_std": -0.03446929320007161,
625
+ "naucs_at_1_diff1": 0.4503277010269023,
626
+ "naucs_at_3_max": 0.029009734517562568,
627
+ "naucs_at_3_std": 0.005402681474374929,
628
+ "naucs_at_3_diff1": -0.048706505991403734,
629
+ "naucs_at_5_max": -0.08207834225628742,
630
+ "naucs_at_5_std": -0.0032984001708723115,
631
+ "naucs_at_5_diff1": -0.17283336776461594,
632
+ "naucs_at_10_max": -0.17977285443640584,
633
+ "naucs_at_10_std": -0.04541684253923944,
634
+ "naucs_at_10_diff1": -0.23295672062273604,
635
+ "naucs_at_20_max": -0.170593905587224,
636
+ "naucs_at_20_std": 0.015384406511413186,
637
+ "naucs_at_20_diff1": -0.2551053459454954,
638
+ "naucs_at_50_max": -0.22984435384481694,
639
+ "naucs_at_50_std": 0.04443493398633732,
640
+ "naucs_at_50_diff1": -0.27937793023113594,
641
+ "naucs_at_100_max": -0.25957527630817795,
642
+ "naucs_at_100_std": 0.057081217133354005,
643
+ "naucs_at_100_diff1": -0.3033742727110607
644
  },
645
  "vidore/synthetic_economics_macro_economy_2024_filtered_v1.0": {
646
+ "ndcg_at_1": 0.65517,
647
+ "ndcg_at_3": 0.62816,
648
+ "ndcg_at_5": 0.6159,
649
+ "ndcg_at_10": 0.58912,
650
+ "ndcg_at_20": 0.61706,
651
+ "ndcg_at_50": 0.67765,
652
+ "ndcg_at_100": 0.71471,
653
+ "map_at_1": 0.08673,
654
+ "map_at_3": 0.19894,
655
+ "map_at_5": 0.26716,
656
+ "map_at_10": 0.33292,
657
+ "map_at_20": 0.40089,
658
+ "map_at_50": 0.45667,
659
+ "map_at_100": 0.4844,
660
+ "recall_at_1": 0.08673,
661
+ "recall_at_3": 0.24596,
662
+ "recall_at_5": 0.34869,
663
+ "recall_at_10": 0.4683,
664
+ "recall_at_20": 0.62193,
665
+ "recall_at_50": 0.80907,
666
+ "recall_at_100": 0.92119,
667
+ "precision_at_1": 0.65517,
668
+ "precision_at_3": 0.58621,
669
+ "precision_at_5": 0.55172,
670
+ "precision_at_10": 0.42931,
671
+ "precision_at_20": 0.32672,
672
  "precision_at_50": 0.19759,
673
+ "precision_at_100": 0.1281,
674
+ "mrr_at_1": 0.6896551724137931,
675
+ "mrr_at_3": 0.7902298850574713,
676
+ "mrr_at_5": 0.7971264367816091,
677
+ "mrr_at_10": 0.8015051997810618,
678
+ "mrr_at_20": 0.8015051997810618,
679
+ "mrr_at_50": 0.8023262178434591,
680
+ "mrr_at_100": 0.8023262178434591,
681
+ "naucs_at_1_max": -0.046307070115366736,
682
+ "naucs_at_1_std": 0.15790688174002224,
683
+ "naucs_at_1_diff1": -0.14987305341556462,
684
+ "naucs_at_3_max": 0.11671339502888234,
685
+ "naucs_at_3_std": 0.11834362034275059,
686
+ "naucs_at_3_diff1": -0.17762652230301346,
687
+ "naucs_at_5_max": 0.10679449683484428,
688
+ "naucs_at_5_std": 0.09453878709557821,
689
+ "naucs_at_5_diff1": -0.1792193824994377,
690
+ "naucs_at_10_max": 0.10651036963192122,
691
+ "naucs_at_10_std": 0.207447480653671,
692
+ "naucs_at_10_diff1": -0.23752907816911517,
693
+ "naucs_at_20_max": 0.05980777985340397,
694
+ "naucs_at_20_std": 0.23365933479100473,
695
+ "naucs_at_20_diff1": -0.2025898492844735,
696
+ "naucs_at_50_max": -0.012623885278852142,
697
+ "naucs_at_50_std": 0.2463485456131626,
698
+ "naucs_at_50_diff1": -0.18909262376408734,
699
+ "naucs_at_100_max": -0.08872391317584416,
700
+ "naucs_at_100_std": 0.17948141514389135,
701
+ "naucs_at_100_diff1": -0.186930857132979
702
  },
703
  "vidore/synthetic_rse_restaurant_filtered_v1.0": {
704
  "ndcg_at_1": 0.52632,
705
+ "ndcg_at_3": 0.52504,
706
+ "ndcg_at_5": 0.57289,
707
+ "ndcg_at_10": 0.62003,
708
+ "ndcg_at_20": 0.65505,
709
+ "ndcg_at_50": 0.68979,
710
+ "ndcg_at_100": 0.70204,
711
+ "map_at_1": 0.29474,
712
+ "map_at_3": 0.40396,
713
+ "map_at_5": 0.45972,
714
+ "map_at_10": 0.50866,
715
+ "map_at_20": 0.5276,
716
+ "map_at_50": 0.54534,
717
+ "map_at_100": 0.55161,
718
+ "recall_at_1": 0.29474,
719
+ "recall_at_3": 0.48191,
720
+ "recall_at_5": 0.61543,
721
+ "recall_at_10": 0.75002,
722
+ "recall_at_20": 0.85537,
723
+ "recall_at_50": 0.94841,
724
+ "recall_at_100": 0.98246,
725
  "precision_at_1": 0.52632,
726
+ "precision_at_3": 0.34503,
727
+ "precision_at_5": 0.28421,
728
+ "precision_at_10": 0.19649,
729
+ "precision_at_20": 0.12281,
730
+ "precision_at_50": 0.06912,
731
+ "precision_at_100": 0.03877,
732
+ "mrr_at_1": 0.543859649122807,
733
  "mrr_at_3": 0.6403508771929824,
734
+ "mrr_at_5": 0.6526315789473685,
735
+ "mrr_at_10": 0.6602548036758563,
736
+ "mrr_at_20": 0.6651739829371409,
737
+ "mrr_at_50": 0.6655248601301235,
738
+ "mrr_at_100": 0.6655248601301235,
739
+ "naucs_at_1_max": 0.07429395157545386,
740
+ "naucs_at_1_std": 0.03015644215746205,
741
+ "naucs_at_1_diff1": 0.029188519373247488,
742
+ "naucs_at_3_max": -0.14556834354246306,
743
+ "naucs_at_3_std": -0.03551874034674307,
744
+ "naucs_at_3_diff1": -0.03426575331185958,
745
+ "naucs_at_5_max": -0.132667608690031,
746
+ "naucs_at_5_std": 0.046883073270420496,
747
+ "naucs_at_5_diff1": 0.020789980948042533,
748
+ "naucs_at_10_max": -0.15350074281217774,
749
+ "naucs_at_10_std": 0.021602996730365333,
750
+ "naucs_at_10_diff1": 0.10610855445931551,
751
+ "naucs_at_20_max": -0.2870595843942612,
752
+ "naucs_at_20_std": -0.053200969005437526,
753
+ "naucs_at_20_diff1": 0.009614869287472011,
754
+ "naucs_at_50_max": -0.3747840510110352,
755
+ "naucs_at_50_std": -0.1315278407645466,
756
+ "naucs_at_50_diff1": -0.08682413153737878,
757
+ "naucs_at_100_max": -0.3797677839881161,
758
+ "naucs_at_100_std": -0.11609328262961503,
759
+ "naucs_at_100_diff1": -0.12296867865157128
760
  },
761
  "vidore/synthetic_axa_filtered_v1.0": {
762
+ "ndcg_at_1": 0.61111,
763
+ "ndcg_at_3": 0.6462,
764
+ "ndcg_at_5": 0.68292,
765
+ "ndcg_at_10": 0.70017,
766
+ "ndcg_at_20": 0.72201,
767
+ "ndcg_at_50": 0.7613,
768
+ "ndcg_at_100": 0.7658,
769
+ "map_at_1": 0.29544,
770
+ "map_at_3": 0.41914,
771
+ "map_at_5": 0.49958,
772
+ "map_at_10": 0.57784,
773
+ "map_at_20": 0.61104,
774
+ "map_at_50": 0.63144,
775
+ "map_at_100": 0.63315,
776
+ "recall_at_1": 0.29544,
777
+ "recall_at_3": 0.51973,
778
+ "recall_at_5": 0.66378,
779
+ "recall_at_10": 0.78472,
780
+ "recall_at_20": 0.85945,
781
+ "recall_at_50": 0.97412,
782
  "recall_at_100": 0.98765,
783
+ "precision_at_1": 0.61111,
784
+ "precision_at_3": 0.46296,
785
+ "precision_at_5": 0.42222,
786
+ "precision_at_10": 0.31111,
787
+ "precision_at_20": 0.19167,
788
+ "precision_at_50": 0.09111,
789
  "precision_at_100": 0.04667,
790
  "mrr_at_1": 0.5555555555555556,
791
+ "mrr_at_3": 0.6759259259259259,
792
+ "mrr_at_5": 0.7037037037037037,
793
+ "mrr_at_10": 0.7037037037037037,
794
+ "mrr_at_20": 0.7037037037037037,
795
+ "mrr_at_50": 0.7060185185185185,
796
+ "mrr_at_100": 0.7060185185185185,
797
+ "naucs_at_1_max": 0.06370640145678692,
798
+ "naucs_at_1_std": 0.42153041887902504,
799
+ "naucs_at_1_diff1": 0.35531111328419196,
800
+ "naucs_at_3_max": -0.49137089517804067,
801
+ "naucs_at_3_std": -0.41095639069482337,
802
+ "naucs_at_3_diff1": 0.01707919337550571,
803
+ "naucs_at_5_max": -0.6120658234932744,
804
+ "naucs_at_5_std": -0.474389941591449,
805
+ "naucs_at_5_diff1": -0.165061956148533,
806
+ "naucs_at_10_max": -0.6883907404624792,
807
+ "naucs_at_10_std": -0.45513671453258775,
808
+ "naucs_at_10_diff1": -0.42316346500786917,
809
+ "naucs_at_20_max": -0.733957515102311,
810
+ "naucs_at_20_std": -0.4723325528538845,
811
+ "naucs_at_20_diff1": -0.4807537601912433,
812
+ "naucs_at_50_max": -0.7013881037720148,
813
+ "naucs_at_50_std": -0.45324759826825234,
814
+ "naucs_at_50_diff1": -0.4882986634795409,
815
+ "naucs_at_100_max": -0.6924248438890106,
816
+ "naucs_at_100_std": -0.4400487565870318,
817
+ "naucs_at_100_diff1": -0.4661845036852439
818
  },
819
  "vidore/synthetic_rse_restaurant_filtered_v1.0_multilingual": {
820
+ "ndcg_at_1": 0.54825,
821
+ "ndcg_at_3": 0.53008,
822
+ "ndcg_at_5": 0.56705,
823
+ "ndcg_at_10": 0.6257,
824
+ "ndcg_at_20": 0.6625,
825
+ "ndcg_at_50": 0.69353,
826
+ "ndcg_at_100": 0.70465,
827
+ "map_at_1": 0.29816,
828
+ "map_at_3": 0.4016,
829
+ "map_at_5": 0.4527,
830
+ "map_at_10": 0.50738,
831
+ "map_at_20": 0.52962,
832
+ "map_at_50": 0.5466,
833
+ "map_at_100": 0.55224,
834
+ "recall_at_1": 0.29816,
835
+ "recall_at_3": 0.48014,
836
+ "recall_at_5": 0.59266,
837
+ "recall_at_10": 0.76588,
838
+ "recall_at_20": 0.87318,
839
+ "recall_at_50": 0.95235,
840
+ "recall_at_100": 0.98465,
841
+ "precision_at_1": 0.54825,
842
+ "precision_at_3": 0.34795,
843
+ "precision_at_5": 0.28158,
844
+ "precision_at_10": 0.19956,
845
+ "precision_at_20": 0.12654,
846
+ "precision_at_50": 0.06991,
847
+ "precision_at_100": 0.03877,
848
+ "mrr_at_1": 0.5394736842105263,
849
+ "mrr_at_3": 0.6381578947368421,
850
+ "mrr_at_5": 0.6528508771929824,
851
+ "mrr_at_10": 0.666278543581175,
852
+ "mrr_at_20": 0.6693822460076329,
853
+ "mrr_at_50": 0.6696454039023697,
854
+ "mrr_at_100": 0.6697038834345334,
855
+ "naucs_at_1_max": -0.10947777032499051,
856
+ "naucs_at_1_std": -0.08674361998375714,
857
+ "naucs_at_1_diff1": 0.057249370353071344,
858
+ "naucs_at_3_max": -0.06687224875668692,
859
+ "naucs_at_3_std": 0.0484364629589946,
860
+ "naucs_at_3_diff1": -0.05462426569089335,
861
+ "naucs_at_5_max": -0.08403937224497868,
862
+ "naucs_at_5_std": 0.07293419772827552,
863
+ "naucs_at_5_diff1": -0.05958666027516894,
864
+ "naucs_at_10_max": -0.10838130867204888,
865
+ "naucs_at_10_std": 0.05587982504285361,
866
+ "naucs_at_10_diff1": 0.002192750308197299,
867
+ "naucs_at_20_max": -0.1628596976822229,
868
+ "naucs_at_20_std": 0.06415536794747155,
869
+ "naucs_at_20_diff1": -0.0901131230050924,
870
+ "naucs_at_50_max": -0.2471444987505313,
871
+ "naucs_at_50_std": 0.02178464235724526,
872
+ "naucs_at_50_diff1": -0.16484319783599452,
873
+ "naucs_at_100_max": -0.2476326734790779,
874
+ "naucs_at_100_std": 0.04921366491713892,
875
+ "naucs_at_100_diff1": -0.1925485969942229
876
  },
877
  "vidore/synthetic_axa_filtered_v1.0_multilingual": {
878
+ "ndcg_at_1": 0.56944,
879
+ "ndcg_at_3": 0.60114,
880
+ "ndcg_at_5": 0.61308,
881
+ "ndcg_at_10": 0.63301,
882
+ "ndcg_at_20": 0.65962,
883
+ "ndcg_at_50": 0.70087,
884
+ "ndcg_at_100": 0.71794,
885
+ "map_at_1": 0.26807,
886
+ "map_at_3": 0.38274,
887
+ "map_at_5": 0.44543,
888
+ "map_at_10": 0.51783,
889
+ "map_at_20": 0.55091,
890
+ "map_at_50": 0.57108,
891
+ "map_at_100": 0.57464,
892
+ "recall_at_1": 0.26807,
893
+ "recall_at_3": 0.47345,
894
+ "recall_at_5": 0.56537,
895
+ "recall_at_10": 0.69703,
896
+ "recall_at_20": 0.78343,
897
+ "recall_at_50": 0.90152,
898
+ "recall_at_100": 0.97889,
899
+ "precision_at_1": 0.56944,
900
+ "precision_at_3": 0.43056,
901
+ "precision_at_5": 0.38056,
902
+ "precision_at_10": 0.27917,
903
+ "precision_at_20": 0.17569,
904
+ "precision_at_50": 0.08583,
905
+ "precision_at_100": 0.04569,
906
+ "mrr_at_1": 0.5,
907
+ "mrr_at_3": 0.6481481481481483,
908
+ "mrr_at_5": 0.6550925925925927,
909
+ "mrr_at_10": 0.6566358024691359,
910
+ "mrr_at_20": 0.6577041785375121,
911
+ "mrr_at_50": 0.659896412080552,
912
+ "mrr_at_100": 0.6607831288544433,
913
+ "naucs_at_1_max": 0.16468725740231643,
914
+ "naucs_at_1_std": 0.37919017176717934,
915
+ "naucs_at_1_diff1": 0.30214037198755495,
916
+ "naucs_at_3_max": -0.2419012234949494,
917
+ "naucs_at_3_std": -0.11880659562104584,
918
+ "naucs_at_3_diff1": -0.05983117053027332,
919
+ "naucs_at_5_max": -0.3435266789444122,
920
+ "naucs_at_5_std": -0.15122373891433766,
921
+ "naucs_at_5_diff1": -0.14301810021483025,
922
+ "naucs_at_10_max": -0.4049299119799346,
923
+ "naucs_at_10_std": -0.11551373909975436,
924
+ "naucs_at_10_diff1": -0.3087348082925975,
925
+ "naucs_at_20_max": -0.4557080881366001,
926
+ "naucs_at_20_std": -0.1316035057344777,
927
+ "naucs_at_20_diff1": -0.3767740930238455,
928
+ "naucs_at_50_max": -0.48221867587284106,
929
+ "naucs_at_50_std": -0.16520884486012927,
930
+ "naucs_at_50_diff1": -0.437539802211166,
931
+ "naucs_at_100_max": -0.5041035001503106,
932
+ "naucs_at_100_std": -0.19246332353391876,
933
+ "naucs_at_100_diff1": -0.450877397124322
934
  },
935
  "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered_multilingual": {
936
+ "ndcg_at_1": 0.62656,
937
+ "ndcg_at_3": 0.62898,
938
+ "ndcg_at_5": 0.64249,
939
+ "ndcg_at_10": 0.67441,
940
+ "ndcg_at_20": 0.69935,
941
+ "ndcg_at_50": 0.72034,
942
+ "ndcg_at_100": 0.7304,
943
+ "map_at_1": 0.37553,
944
+ "map_at_3": 0.51341,
945
+ "map_at_5": 0.55134,
946
+ "map_at_10": 0.58654,
947
+ "map_at_20": 0.60275,
948
+ "map_at_50": 0.61047,
949
+ "map_at_100": 0.61272,
950
+ "recall_at_1": 0.37553,
951
+ "recall_at_3": 0.58317,
952
+ "recall_at_5": 0.6651,
953
+ "recall_at_10": 0.76267,
954
+ "recall_at_20": 0.83047,
955
+ "recall_at_50": 0.89402,
956
+ "recall_at_100": 0.93005,
957
+ "precision_at_1": 0.62656,
958
+ "precision_at_3": 0.39115,
959
+ "precision_at_5": 0.28812,
960
+ "precision_at_10": 0.18453,
961
+ "precision_at_20": 0.10859,
962
+ "precision_at_50": 0.05059,
963
+ "precision_at_100": 0.02756,
964
+ "mrr_at_1": 0.6203125,
965
+ "mrr_at_3": 0.7114583333333329,
966
+ "mrr_at_5": 0.7251302083333327,
967
+ "mrr_at_10": 0.7312574404761898,
968
+ "mrr_at_20": 0.7330658025942703,
969
+ "mrr_at_50": 0.7337640283895484,
970
+ "mrr_at_100": 0.733871254289439,
971
+ "naucs_at_1_max": 0.2039250412963779,
972
+ "naucs_at_1_std": -0.07024884868861188,
973
+ "naucs_at_1_diff1": 0.45606770903022215,
974
+ "naucs_at_3_max": 0.058212985355640734,
975
+ "naucs_at_3_std": -0.05959626421557797,
976
+ "naucs_at_3_diff1": -0.029625222250297182,
977
+ "naucs_at_5_max": 0.012523889018321476,
978
+ "naucs_at_5_std": -0.05950139173785889,
979
+ "naucs_at_5_diff1": -0.12093229795276991,
980
+ "naucs_at_10_max": -0.07885246896142604,
981
+ "naucs_at_10_std": -0.07584435766192119,
982
+ "naucs_at_10_diff1": -0.19868179893511964,
983
+ "naucs_at_20_max": -0.08104765093094657,
984
+ "naucs_at_20_std": -0.020633911923048417,
985
+ "naucs_at_20_diff1": -0.2329474014068329,
986
+ "naucs_at_50_max": -0.10191575025539294,
987
+ "naucs_at_50_std": 0.006497205696773289,
988
+ "naucs_at_50_diff1": -0.2654796386435107,
989
+ "naucs_at_100_max": -0.12912675673565918,
990
+ "naucs_at_100_std": 0.002658413394124888,
991
+ "naucs_at_100_diff1": -0.286740024806076
992
  },
993
  "vidore/synthetics_economics_macro_economy_2024_filtered_v1.0_multilingual": {
994
+ "ndcg_at_1": 0.56897,
995
+ "ndcg_at_3": 0.55347,
996
+ "ndcg_at_5": 0.54697,
997
+ "ndcg_at_10": 0.54353,
998
+ "ndcg_at_20": 0.5754,
999
+ "ndcg_at_50": 0.64435,
1000
+ "ndcg_at_100": 0.68217,
1001
+ "map_at_1": 0.06995,
1002
+ "map_at_3": 0.16788,
1003
+ "map_at_5": 0.22262,
1004
+ "map_at_10": 0.29539,
1005
+ "map_at_20": 0.35806,
1006
+ "map_at_50": 0.41596,
1007
+ "map_at_100": 0.44377,
1008
+ "recall_at_1": 0.06995,
1009
+ "recall_at_3": 0.22001,
1010
+ "recall_at_5": 0.31105,
1011
+ "recall_at_10": 0.45156,
1012
+ "recall_at_20": 0.60587,
1013
+ "recall_at_50": 0.80629,
1014
+ "recall_at_100": 0.91617,
1015
+ "precision_at_1": 0.56897,
1016
+ "precision_at_3": 0.52155,
1017
+ "precision_at_5": 0.49483,
1018
+ "precision_at_10": 0.4069,
1019
+ "precision_at_20": 0.31207,
1020
+ "precision_at_50": 0.19595,
1021
+ "precision_at_100": 0.1275,
1022
+ "mrr_at_1": 0.5689655172413793,
1023
+ "mrr_at_3": 0.7040229885057472,
1024
+ "mrr_at_5": 0.7173850574712642,
1025
+ "mrr_at_10": 0.7212370005473451,
1026
+ "mrr_at_20": 0.7221635921046082,
1027
+ "mrr_at_50": 0.7223688466202075,
1028
+ "mrr_at_100": 0.7223688466202075,
1029
+ "naucs_at_1_max": 0.008287845664171043,
1030
+ "naucs_at_1_std": 0.1276905868842376,
1031
+ "naucs_at_1_diff1": 0.11685328065061831,
1032
+ "naucs_at_3_max": 0.04231917732527594,
1033
+ "naucs_at_3_std": 0.09097968258247123,
1034
+ "naucs_at_3_diff1": 0.07039779015197889,
1035
+ "naucs_at_5_max": 0.02450452954672108,
1036
+ "naucs_at_5_std": 0.08382631941812721,
1037
+ "naucs_at_5_diff1": 0.04903893807286711,
1038
+ "naucs_at_10_max": 0.06258885863723426,
1039
+ "naucs_at_10_std": 0.1489118063390922,
1040
+ "naucs_at_10_diff1": -0.001772551183085018,
1041
+ "naucs_at_20_max": 0.05127122134169797,
1042
+ "naucs_at_20_std": 0.16207987797849627,
1043
+ "naucs_at_20_diff1": -0.037084808056486104,
1044
+ "naucs_at_50_max": 0.012079926402609917,
1045
+ "naucs_at_50_std": 0.16924631618333094,
1046
+ "naucs_at_50_diff1": -0.09727554327201936,
1047
+ "naucs_at_100_max": -0.03658022842524374,
1048
+ "naucs_at_100_std": 0.12849977057146159,
1049
+ "naucs_at_100_diff1": -0.10217040873502983
1050
  },
1051
  "vidore/restaurant_esg_reports_beir": {
1052
+ "ndcg_at_1": 0.71795,
1053
+ "ndcg_at_3": 0.71188,
1054
+ "ndcg_at_5": 0.73949,
1055
+ "ndcg_at_10": 0.77205,
1056
+ "ndcg_at_20": 0.78712,
1057
+ "ndcg_at_50": 0.80866,
1058
+ "ndcg_at_100": 0.81008,
1059
+ "map_at_1": 0.50513,
1060
+ "map_at_3": 0.62382,
1061
+ "map_at_5": 0.6711,
1062
+ "map_at_10": 0.70078,
1063
+ "map_at_20": 0.7096,
1064
+ "map_at_50": 0.71767,
1065
+ "map_at_100": 0.71825,
1066
+ "recall_at_1": 0.50513,
1067
+ "recall_at_3": 0.68668,
1068
+ "recall_at_5": 0.76906,
1069
+ "recall_at_10": 0.85835,
1070
+ "recall_at_20": 0.90725,
1071
+ "recall_at_50": 0.97476,
1072
+ "recall_at_100": 0.98071,
1073
+ "precision_at_1": 0.73077,
1074
+ "precision_at_3": 0.38462,
1075
+ "precision_at_5": 0.28846,
1076
+ "precision_at_10": 0.175,
1077
+ "precision_at_20": 0.09712,
1078
+ "precision_at_50": 0.045,
1079
+ "precision_at_100": 0.02288,
1080
+ "mrr_at_1": 0.75,
1081
+ "mrr_at_3": 0.8076923076923077,
1082
+ "mrr_at_5": 0.8125,
1083
+ "mrr_at_10": 0.8226495726495725,
1084
+ "mrr_at_20": 0.8226495726495725,
1085
+ "mrr_at_50": 0.8233363858363858,
1086
+ "mrr_at_100": 0.8233363858363858,
1087
+ "naucs_at_1_max": 0.012222350607946745,
1088
+ "naucs_at_1_std": -0.09466371539982948,
1089
+ "naucs_at_1_diff1": 0.5022751223737485,
1090
+ "naucs_at_3_max": -0.03148341001971903,
1091
+ "naucs_at_3_std": -0.017366247697995007,
1092
+ "naucs_at_3_diff1": -0.2806074619839212,
1093
+ "naucs_at_5_max": -0.1368097107443235,
1094
+ "naucs_at_5_std": -0.05686309477532154,
1095
+ "naucs_at_5_diff1": -0.34542841772396893,
1096
+ "naucs_at_10_max": -0.11047704445523453,
1097
+ "naucs_at_10_std": 0.016760999608655385,
1098
+ "naucs_at_10_diff1": -0.36219386821062893,
1099
+ "naucs_at_20_max": -0.18118046303544919,
1100
+ "naucs_at_20_std": 0.00660518472055942,
1101
+ "naucs_at_20_diff1": -0.35322674833881423,
1102
+ "naucs_at_50_max": -0.15496181424390146,
1103
+ "naucs_at_50_std": 0.025351601144475155,
1104
+ "naucs_at_50_diff1": -0.3271145822876798,
1105
+ "naucs_at_100_max": -0.14539299033899847,
1106
+ "naucs_at_100_std": 0.020343129158802924,
1107
+ "naucs_at_100_diff1": -0.3237772572696072
1108
  }
1109
  }
1110
  }
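The nine dataset blocks above, from `vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered` through `vidore/restaurant_esg_reports_beir`, appear to line up with the nine columns of the README's performance table. As a minimal sketch, the snippet below averages their updated `ndcg_at_5` values. The numbers are copied from the `+` lines of this diff. The grouping into one benchmark average and the variable names are assumptions made for illustration, not something `results.json` states.

```python
from statistics import mean

# ndcg_at_5 values copied from the "+" lines of the results.json diff above.
# Treating these nine datasets as one benchmark group is an assumption.
ndcg_at_5 = {
    "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered": 0.66136,
    "vidore/synthetic_economics_macro_economy_2024_filtered_v1.0": 0.6159,
    "vidore/synthetic_rse_restaurant_filtered_v1.0": 0.57289,
    "vidore/synthetic_axa_filtered_v1.0": 0.68292,
    "vidore/synthetic_rse_restaurant_filtered_v1.0_multilingual": 0.56705,
    "vidore/synthetic_axa_filtered_v1.0_multilingual": 0.61308,
    "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered_multilingual": 0.64249,
    "vidore/synthetics_economics_macro_economy_2024_filtered_v1.0_multilingual": 0.54697,
    "vidore/restaurant_esg_reports_beir": 0.73949,
}

# Unweighted mean across datasets (each dataset counts equally).
average = mean(ndcg_at_5.values())
print(f"mean ndcg_at_5 over {len(ndcg_at_5)} datasets: {average:.5f}")
```

An unweighted mean gives every dataset the same weight regardless of how many queries it contains. A query-weighted average would need the per-dataset query counts, which this diff does not show.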
training_config.yml CHANGED
@@ -1,6 +1,6 @@
  config:
  (): colpali_engine.trainer.colmodel_training.ColModelTrainingConfig
- output_dir: !path ../../../models/colqwen2_5_train_single_source_3ep_r32_512bs_vdr_7b
+ output_dir: !path ../../../models/colqwen2_5_train_single_source_3ep_r32_512bs_vdr_7b_1e4
  processor:
  (): colpali_engine.utils.transformers_wrappers.AllPurposeWrapper
  class_to_instanciate: !ext colpali_engine.models.ColQwen2_5_Processor
@@ -53,13 +53,13 @@ config:
  logging_steps: 10
  eval_steps: 100
  warmup_ratio: 0.01
- learning_rate: 2e-4
+ learning_rate: 1e-4
  save_total_limit: 1
  # resume_from_checkpoint: true
  optim: "paged_adamw_8bit"
  # wandb logging
  # wandb_project: "mllm"
- run_name: "colqwen2_5_train_single_source_3ep_r32_512bs_vdr_7b"
+ run_name: "colqwen2_5_train_single_source_3ep_r32_512bs_vdr_7b_1e4"
  report_to: "wandb"
  
  
@@ -72,4 +72,4 @@ config:
  bias: "none"
  task_type: "FEATURE_EXTRACTION"
  target_modules: '(.*(model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(custom_text_proj).*$)'
- # target_modules: '(.*(language_model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(custom_text_proj).*$)'
+ # target_modules: '(.*(language_model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(custom_text_proj).*$)'
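The config keeps two `target_modules` patterns: the active one keyed on `model` and a commented-out one keyed on `language_model`. The sketch below illustrates how they could differ in which layers receive LoRA adapters. It assumes a string `target_modules` is applied as a full-match regular expression against each module name, which is my understanding of PEFT's behavior. The module names in `candidates` are hypothetical and are not taken from the actual checkpoint.

```python
import re

# Both patterns are copied verbatim from training_config.yml.
ACTIVE = r"(.*(model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(custom_text_proj).*$)"
COMMENTED = r"(.*(language_model).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$|.*(custom_text_proj).*$)"

# Hypothetical module names, chosen only to exercise the patterns.
candidates = [
    "model.language_model.layers.0.self_attn.q_proj",
    "model.visual.blocks.0.mlp.down_proj",
    "model.visual.blocks.0.attn.qkv",
    "custom_text_proj",
]


def selected(pattern: str, names: list[str]) -> list[str]:
    """Return the names that the pattern matches in full."""
    return [name for name in names if re.fullmatch(pattern, name)]


for label, pattern in (("active", ACTIVE), ("commented", COMMENTED)):
    print(label, selected(pattern, candidates))
```

On these example names, the active pattern also selects the projection layer under `model.visual`, while the commented-out pattern restricts adapters to names containing `language_model` plus `custom_text_proj`. Whether the real model exposes names shaped like these depends on the installed `transformers` version, so check them against `model.named_modules()` before relying on either pattern.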