Post
914
The demo for Camel-Doc-OCR-062825 (exp) is optimized for document retrieval and direct Markdown (.md) generation from images and PDFs. Additional demos include OCRFlux-3B (document OCR), VilaSR (spatial reasoning with visual drawing), and ShotVL (cinematic language understanding). 🐪
✦ Space : prithivMLmods/Doc-VLMs-v2-Localization
Models :
⤷ camel-doc-ocr-062825 : prithivMLmods/Camel-Doc-OCR-062825
⤷ ocrflux-3b : ChatDOC/OCRFlux-3B
⤷ vilasr : AntResearchNLP/ViLaSR
⤷ shotvl : Vchitect/ShotVL-7B
⤷ Multimodal Implementations : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
The community GPU grant was given by Hugging Face — special thanks to them. This space supports the following tasks: (image inference, video inference) with result markdown canvas and object detection/localization. 🤗🚀
.
.
.
To know more about it, visit the model card of the respective model. !!
✦ Space : prithivMLmods/Doc-VLMs-v2-Localization
Models :
⤷ camel-doc-ocr-062825 : prithivMLmods/Camel-Doc-OCR-062825
⤷ ocrflux-3b : ChatDOC/OCRFlux-3B
⤷ vilasr : AntResearchNLP/ViLaSR
⤷ shotvl : Vchitect/ShotVL-7B
⤷ Multimodal Implementations : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
The community GPU grant was given by Hugging Face — special thanks to them. This space supports the following tasks: (image inference, video inference) with result markdown canvas and object detection/localization. 🤗🚀
.
.
.
To know more about it, visit the model card of the respective model. !!