xiangan

http://anxiangsir.com

anxiangsir

AI & ML interests

None yet

Recent Activity

liked a model about 4 hours ago

microsoft/Mage-VL

liked a model about 4 hours ago

moonshotai/Kimi-K3

liked a model about 16 hours ago

microsoft/Mage-ViT

upvoted a paper 6 days ago

Mage-Flow: An Efficient Native-Resolution Foundation Model for Image Generation and Editing

Paper • 2607.19064 • Published 7 days ago • 72

upvoted a collection 7 days ago

TimeLens2

8 items • Updated about 3 hours ago • 13

upvoted a paper 7 days ago

TimeLens2: Generalist Video Temporal Grounding with Multimodal LLMs

Paper • 2607.17423 • Published 9 days ago • 165

upvoted a collection 27 days ago

LLaVA-OneVision-1.5

5 items • Updated May 6 • 2

upvoted a collection about 1 month ago

MiniMax-M3

4 items • Updated 26 days ago • 12

upvoted 2 papers about 2 months ago

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

Paper • 2606.13473 • Published Jun 11 • 93

Benchmarking Visual State Tracking in Multimodal Video Understanding

Paper • 2606.03920 • Published Jun 2 • 53

upvoted a paper 2 months ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published May 27 • 76

upvoted a collection 2 months ago

LLaVA-OneVision-2

2 items • Updated May 27 • 2

upvoted 2 papers 2 months ago

LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence

Paper • 2605.25979 • Published May 25 • 27

ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning

Paper • 2605.20342 • Published May 19 • 34

upvoted a collection 3 months ago

LLaVA-OneVision-2

2 items • Updated May 20 • 6

upvoted a paper 4 months ago

FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Paper • 2604.04901 • Published Apr 6 • 40

upvoted 2 papers 5 months ago

LLaDA-o: An Effective and Length-Adaptive Omni Diffusion Model

Paper • 2603.01068 • Published Mar 1 • 22

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Paper • 2510.14979 • Published Oct 16, 2025 • 70

upvoted an article 5 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 171

upvoted a paper 5 months ago

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Paper • 2603.03241 • Published Mar 3 • 88

upvoted a changelog 5 months ago

Public Storage Add-ons

Hugging Face Changelog • Feb 26 • 170

upvoted a collection 5 months ago

onevision-encoder

4 items • Updated May 6 • 6

upvoted a paper 5 months ago

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20