Yuxuan Wang

ColorfulAI

https://patrick-tssn.github.io/

patrick-tssn

AI & ML interests

Multimodal Learning

Recent Activity

authored a paper 10 days ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 29 days ago • 139

authored 4 papers 29 days ago

Qwen3-Omni Technical Report

Paper • 2509.17765 • Published Sep 22 • 142

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

Paper • 2510.10689 • Published Oct 12 • 46

V-HUB: A Visual-Centric Humor Understanding Benchmark for Video LLMs

Paper • 2509.25773 • Published Sep 30

Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception

Paper • 2510.12720 • Published Oct 14 • 1

New activity in bigai-nlco/VideoHallucer 2 months ago

remove duplicate data in temporal.json

#3 opened 2 months ago by

upvoted 2 papers 7 months ago

Discrete Markov Bridge

Paper • 2505.19752 • Published May 26 • 17

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19 • 27

authored a paper 7 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19 • 27

updated a dataset 8 months ago

ColorfulAI/MoviePuzzle

Viewer • Updated May 14 • 1 • 12

published a dataset 8 months ago

ColorfulAI/MoviePuzzle

Viewer • Updated May 14 • 1 • 12

New activity in ColorfulAI/M4-IT 9 months ago

Update dataset card with OmniMMI information

#1 opened 9 months ago by

New activity in bigai-nlco/OmniMMI 9 months ago

Add task category, link to code

#2 opened 9 months ago by

New activity in ColorfulAI/M4-Audio-LongVA-7B-Qwen2 9 months ago

Add pipeline tag, library name, paper link and Github link

#1 opened 9 months ago by

New activity in ColorfulAI/M4-LongVA-7B-Qwen2 9 months ago

Add pipeline tag, library name, link to paper and project page

#1 opened 9 months ago by

authored a paper 9 months ago

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Paper • 2503.22952 • Published Mar 29 • 17

updated a model 9 months ago

ColorfulAI/OpenOmni-8B-Llama3-Omni

9B • Updated Apr 2 • 5 • 1

upvoted a paper 9 months ago

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Paper • 2503.22952 • Published Mar 29 • 17

commented on a paper 9 months ago

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Paper • 2503.22952 • Published Mar 29 • 17

updated a model 9 months ago

ColorfulAI/OpenOmni-7B-Qwen2-Omni

9B • Updated Apr 2 • 5 • 1