Hugging Face's logo Hugging Face
  • Models
  • Datasets
  • Spaces
  • Docs
  • Enterprise
  • Pricing

  • Log In
  • Sign Up
lbourdois 's Collections
French Translations
FAT5
French prompts
French NER
French QA
French VQA datasets
French caption datasets
French OCR datasets
French retriever datasets
French audio datasets (pretraining)

French OCR datasets

updated May 10

Datasets I cleaned with an image, a prompt question (like "transcribe the text in this image") and an answer. Can be used to train VLMs.

Upvote
-

  • lbourdois/OCR-neulab-PangeaInstruct-OCR-clean

    Viewer • Updated Mar 5 • 30k • 169

  • lbourdois/OCR-liboaccn-OPUS-MIT-5M-clean

    Viewer • Updated Mar 5 • 530k • 293
Upvote
-
  • Collection guide
  • Browse collections
Company
TOS Privacy About Jobs
Website
Models Datasets Spaces Pricing Docs