OCR/VLMs - a Drishti Collection

OCR/VLMs

updated 23 days ago

moonshotai/Kimi-VL-A3B-Thinking-2506

Image-Text-to-Text • 16B • Updated 22 days ago • 43.6k downloads • 221 likes

Note: powerful reasoning vision-language model with 3B active parameters; smarter while using fewer tokens; supports long documents and videos.

nanonets/Nanonets-OCR-s

Image-Text-to-Text • 4B • Updated 28 days ago • 287k downloads • 1.41k likes

Note: 3.75B-parameter OCR model based on Qwen2.5-VL-3B-Instruct (open source).