Spaces: Running on Zero
Commit · 6c3e99a
1 Parent(s): e3dcfdd
formatting
app.py CHANGED
@@ -135,8 +135,10 @@ def generate_response(image):
 
 title = "ColPali fine-tuning Query Generator"
 description = """[ColPali](https://huggingface.co/papers/2407.01449) is a very exciting new approach to multimodal document retrieval which aims to replace existing document retrievers which often rely on an OCR step with an end-to-end multimodal approach.
-
+
+To train or fine-tune a ColPali model, we need a dataset of image-text pairs which represent the document images and the relevant text queries which those documents should match.
 To make the ColPali models work even better we might want a dataset of query/image document pairs related to our domain or task.
+
 One way in which we might go about generating such a dataset is to use an VLM to generate synthetic queries for us.
 This space uses the [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) to generate queries for a document, based on an input document image.
 
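For reference, below is a minimal sketch of the kind of query-generation call the description refers to, using the standard transformers API for Qwen2-VL. This is not the Space's actual `generate_response` implementation; the model-loading options, prompt wording, generation settings, and the file name `document_page.png` are illustrative assumptions.

```python
# Sketch: generate a synthetic retrieval query for one document image with
# Qwen/Qwen2-VL-7B-Instruct. Assumes `transformers`, `accelerate`, and `Pillow`
# are installed and a GPU is available for the 7B model.
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Hypothetical input: a single page of a document rendered as an image.
image = Image.open("document_page.png")

# Chat-style prompt asking the VLM for a query this page should match.
# The exact wording is an assumption, not the Space's prompt.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {
                "type": "text",
                "text": "Write one short search query a user might type to retrieve this document.",
            },
        ],
    }
]

# Build the chat prompt and pack text + image into model inputs.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

# Generate, then decode only the newly produced tokens (the query itself).
output_ids = model.generate(**inputs, max_new_tokens=64)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, output_ids)]
query = processor.batch_decode(trimmed, skip_special_tokens=True)[0]
print(query)
```

Each (image, generated query) pair produced this way can be collected into the image-text dataset that the description says is needed for ColPali fine-tuning.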