Haoran Wei

HaoranWei

AI & ML interests

LLM, CV, OVOD

Recent Activity

liked a model 16 days ago

ds4sd/SmolDocling-256M-preview

liked a model about 2 months ago

can-gaa-hou/GOT-OCR2.0-OpenVINO-INT4

new activity about 2 months ago

stepfun-ai/GOT-OCR2_0:Update README.md

HaoranWei's activity

upvoted a paper 3 months ago

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Paper • 2412.20631 • Published Dec 30, 2024 • 15

upvoted a collection 4 months ago

Document AI

All the papers that can fundamentally help in creating a true open-source processing pipeline. • 1 item • Updated Nov 11, 2024 • 1

upvoted a paper 4 months ago

Focus Anywhere for Fine-grained Multi-page Document Understanding

Paper • 2405.14295 • Published May 23, 2024 • 1

upvoted a collection 4 months ago

PixMo

A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated 20 days ago • 68

upvoted a paper 7 months ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 84

upvoted a paper 9 months ago

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Paper • 2406.16855 • Published Jun 24, 2024 • 56

upvoted a paper 12 months ago

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Paper • 2404.09987 • Published Apr 15, 2024 • 2

upvoted a paper about 1 year ago

Small Language Model Meets with Reinforced Vision Vocabulary

Paper • 2401.12503 • Published Jan 23, 2024 • 32

upvoted 2 papers over 1 year ago

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Paper • 2312.06109 • Published Dec 11, 2023 • 21

Merlin:Empowering Multimodal LLMs with Foresight Minds

Paper • 2312.00589 • Published Nov 30, 2023 • 27