Bhimraj Yadav

bhimrazy

https://bhimraj.com.np

AI & ML interests

Computer Vision, Healthcare, Generative AI and NLP

Recent Activity

upvoted a paper about 2 months ago

Meissa: Multi-modal Medical Agentic Intelligence

upvoted a paper about 2 months ago

Multimodal OCR: Parse Anything from Documents

upvoted an article 6 months ago

Supercharge your OCR Pipelines with Open Models

upvoted 2 papers about 2 months ago

Meissa: Multi-modal Medical Agentic Intelligence

Paper • 2603.09018 • Published Mar 9 • 5

Multimodal OCR: Parse Anything from Documents

Paper • 2603.13032 • Published Mar 13 • 43

upvoted an article 6 months ago

Supercharge your OCR Pipelines with Open Models

Article • Oct 21, 2025 • 309

upvoted 9 papers 7 months ago

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6, 2025 • 131

Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR

Paper • 2509.18174 • Published Sep 17, 2025 • 134

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24, 2025 • 49

Docling Technical Report

Paper • 2408.09869 • Published Aug 19, 2024 • 2

Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion

Paper • 2501.17887 • Published Jan 27, 2025 • 1

Qwen3-Omni Technical Report

Paper • 2509.17765 • Published Sep 22, 2025 • 153

upvoted 4 papers 8 months ago

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 76

MobileCLIP2: Improving Multi-Modal Reinforced Training

Paper • 2508.20691 • Published Aug 28, 2025 • 7

VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 166

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28, 2025 • 63

upvoted an article 12 months ago

Vision Language Models (Better, faster, stronger)

Article • May 12, 2025 • 609

upvoted 3 papers about 1 year ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 207

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 172

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 172
