Running on Zero 13 13 Explainable-Vision-Language-Model 🥶 Generate a video visualizing how a multimodal model attends to an image while generating text
OCR dataset Collection Vietnamese OCR dataset, including word level and line level image data. • 41 items • Updated Apr 11, 2024 • 6