Cuiunbo's Collections

VLM For OCR

Updated Jun 29, 2024

  • Qwen/Qwen-VL

    Text Generation • Updated Jan 25, 2024 • 27.2k • 247

  • google/pix2struct-large

    Image-to-Text • 1B • Updated Sep 6, 2023 • 5.92k • 34

  • THUDM/cogagent-chat-hf

    Text Generation • 18B • Updated Dec 24, 2024 • 880 • 69

  • openbmb/MiniCPM-Llama3-V-2_5

    Image-Text-to-Text • 9B • Updated Jan 15 • 37.8k • 1.4k

  • google/paligemma-3b-pt-896

    Image-Text-to-Text • 3B • Updated 17 days ago • 2.26k • 119

  • UCSC-VLAA/Recap-DataComp-1B

    Viewer • Updated Jan 9 • 1.88B • 3.38k • 176

  • WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

    Paper • 2406.11069 • Published Jun 16, 2024 • 14

  • pbevan11/synthetic-ocr-correction-gpt4o

    Viewer • Updated Jul 25, 2024 • 10k • 17 • 5

  • yifeihu/ACL-23-Paper-OCR-Markdown

    Viewer • Updated Jun 8, 2024 • 2.15k • 27 • 17

  • LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

    Paper • 2406.15319 • Published Jun 21, 2024 • 65