Xing Yun

xing0047

AI & ML interests

Computer Vision

Recent Activity

upvoted a paper 13 days ago

Discrete Diffusion in Large Language and Multimodal Models: A Survey

Paper • 2506.13759 • Published 13 days ago • 41

upvoted a paper 19 days ago

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Paper • 2506.07044 • Published 22 days ago • 105

upvoted 2 papers 20 days ago

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published 21 days ago • 81

Reinforcement Pre-Training

Paper • 2506.08007 • Published 20 days ago • 235

upvoted 2 papers about 1 month ago

Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

Paper • 2505.15966 • Published May 21 • 51

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15 • 119

published a dataset about 2 months ago

xing0047/CMM

Viewer • Updated May 15 • 2.4k • 17

updated a dataset about 2 months ago

xing0047/CMM

Viewer • Updated May 15 • 2.4k • 17

upvoted 2 papers about 2 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 116

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

Paper • 2505.04921 • Published May 8 • 178

upvoted 3 papers 2 months ago

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17 • 16

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published Apr 21 • 75

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

Paper • 2504.12626 • Published Apr 17 • 52

upvoted 2 papers 3 months ago

PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

Paper • 2504.08791 • Published Apr 7 • 133

Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10 • 130

liked a model 3 months ago

Qwen/Qwen2-VL-2B-Instruct

Image-Text-to-Text • 2B • Updated Jan 12 • 1.11M • 429

upvoted 2 papers 3 months ago

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7 • 106

URECA: Unique Region Caption Anything

Paper • 2504.05305 • Published Apr 7 • 36

liked a model 3 months ago

Qwen/Qwen2.5-VL-32B-Instruct

Image-Text-to-Text • 33B • Updated Apr 14 • 419k • 396

upvoted a paper 3 months ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16 • 35
