07/11 ~ Visual Understanding Collection Deep Caption VL - Trained with variable-dimensional pairs • 2 items • Updated about 8 hours ago
Multimodal Implementations Collection Image-Text-to-Text Demos • 11 items • Updated about 15 hours ago • 4
Running on Zero MCP 165 DocScope-R1 📰 cosmos reason1 / docscopeocr / captioner relaxed / visionocr
Running on Zero MCP 100 VisionScope-R2 🔍 behemoth-3b / skycaptioner / spacethinker / spaceom / coreocr
Running on Zero MCP 3 Multimodal VLMs 🪐 vision matters / r vision / vigal / visionary r / monkey ocr
Running on Zero MCP 13 Doc VLMs V2 Localization 🐪 camel-doc-ocr / vilasr-7b / ocrflux-3b / shotvl-7b