Spaces: Running on Zero
Commit · 6c3e99a
1 Parent(s): e3dcfdd
formatting
app.py CHANGED
@@ -135,8 +135,10 @@ def generate_response(image):
 
 title = "ColPali fine-tuning Query Generator"
 description = """[ColPali](https://huggingface.co/papers/2407.01449) is a very exciting new approach to multimodal document retrieval which aims to replace existing document retrievers which often rely on an OCR step with an end-to-end multimodal approach.
-
+
+To train or fine-tune a ColPali model, we need a dataset of image-text pairs which represent the document images and the relevant text queries which those documents should match.
 To make the ColPali models work even better we might want a dataset of query/image document pairs related to our domain or task.
+
 One way in which we might go about generating such a dataset is to use an VLM to generate synthetic queries for us.
 This space uses the [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) to generate queries for a document, based on an input document image.
 
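For reference, below is a minimal sketch of the kind of query-generation call the description refers to, using the standard transformers API for Qwen2-VL. This is not the Space's actual `generate_response` implementation; the model-loading options, prompt wording, generation settings, and the file name `document_page.png` are illustrative assumptions.

```python
# Sketch: generate a synthetic retrieval query for one document image with
# Qwen/Qwen2-VL-7B-Instruct. Assumes `transformers`, `accelerate`, and `Pillow`
# are installed and a GPU is available for the 7B model.
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Hypothetical input: a single page of a document rendered as an image.
image = Image.open("document_page.png")

# Chat-style prompt asking the VLM for a query this page should match.
# The exact wording is an assumption, not the Space's prompt.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {
                "type": "text",
                "text": "Write one short search query a user might type to retrieve this document.",
            },
        ],
    }
]

# Build the chat prompt and pack text + image into model inputs.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

# Generate, then decode only the newly produced tokens (the query itself).
output_ids = model.generate(**inputs, max_new_tokens=64)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, output_ids)]
query = processor.batch_decode(trimmed, skip_special_tokens=True)[0]
print(query)
```

Each (image, generated query) pair produced this way can be collected into the image-text dataset that the description says is needed for ColPali fine-tuning.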