NTT-hil-insight/VDocGenerator-Phi3-vision


RyotaTanaka committed on Apr 16

Commit 2fa451b · verified · 1 Parent(s): 9c42d18

Update README.md

Files changed (1)

README.md +2 -3


@@ -6,7 +6,7 @@ language:
 - en
 base_model:
 - microsoft/Phi-3-vision-128k-instruct
-library_name: transformers
+library_name: peft
 ---
 # VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
 **VDocRAG** is a new RAG framework that can directly understand diverse real-world documents purely from visual features. It was introduced in the paper [VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents](http://arxiv.org/abs/2504.09795) by Tanaka et al. and first released in [this repository](https://github.com/nttmdlab-nlp/VDocRAG).
@@ -80,5 +80,4 @@ The models and weights of VDocRAG in this repo are released under the [NTT Licen
   booktitle = {CVPR},
   year      = {2025}
 }
-```
+```