PyTorch Image Models

https://github.com/rwightman/pytorch-image-models

AI & ML interests

Computer Vision

Recent Activity

rwightman new activity about 7 hours ago

timm/vit_little_patch16_reg4_gap_256.sbb_in1k: Loss exploding to nan

rwightman updated a collection about 14 hours ago

SigLIP 2

rwightman

in timm/vit_little_patch16_reg4_gap_256.sbb_in1k about 7 hours ago

Loss exploding to nan

#1 opened 4 days ago by tony0278611

rwightman

updated a collection about 14 hours ago

SigLIP 2

Collection

OpenCLIP and timm SigLIP 2 models • 47 items • Updated about 14 hours ago • 23

merve

posted an update about 15 hours ago

we're all sleeping on this OCR model rednote-hilab/dots.ocr 🔥

dots.ocr is a new 3B model with sota performance, support for 100 languages & allowing commercial use! 🤯

single e2e model to extract image, convert tables, formula, and more into markdown 📝
try it MohamedRashad/Dots-OCR

rwightman

updated 2 models 1 day ago

timm/naflexvit_base_patch16_siglip.v2_webli

Image Feature Extraction • Updated 1 day ago • 15

timm/naflexvit_so400m_patch16_siglip.v2_webli

Image Feature Extraction • Updated 1 day ago • 12

rwightman

published 2 models 1 day ago

timm/naflexvit_base_patch16_siglip.v2_webli

Image Feature Extraction • Updated 1 day ago • 15

timm/naflexvit_so400m_patch16_siglip.v2_webli

Image Feature Extraction • Updated 1 day ago • 12

merve

posted an update 1 day ago

massive releases and tons of FLUX.1 Krea LoRAs this past week!
here's some of the picks, find more models in collection 🫡 merve/releases-august-2-6890c14248203522b7d0267f

LLMs 💬
> Tencent dropped tencent/Hunyuan-7B-Instruct
> Qwen released Qwen/Qwen3-Coder-30B-A3B-Instruct, 30B MoE with 3B active params for coding (OS)

vision/multimodal
> RedNote released rednote-hilab/dots.ocr - 3B OCR model (OS)
> Cohere released CohereLabs/command-a-vision-07-2025 - 112B (dense!) VLM for 6 languages
> StepFun-AI shipped stepfun-ai/step3 - 321B MoE VLM (OS)
> Skywork shipped Skywork/Skywork-UniPic-1.5B - new any-to-any model (image+text → image+text) (OS)