Drishti's Collections
OCR/VLMs

updated 23 days ago
  • moonshotai/Kimi-VL-A3B-Thinking-2506

    Image-Text-to-Text • 16B • Updated 22 days ago • 43.6k • 221

    Note: powerful reasoning vision LM with 3B active parameters; smarter with fewer tokens; supports long documents and videos


  • nanonets/Nanonets-OCR-s

    Image-Text-to-Text • 4B • Updated 28 days ago • 287k • 1.41k

    Note: 3.75B-parameter OCR model based on Qwen2.5VL-3B-Instruct (OS)
