Hugging Face
Yan Shu

sy1998
AI & ML interests

None yet

Recent Activity

updated a dataset about 1 month ago
sy1998/tempsports
upvoted a paper about 2 months ago
Outline-Guided Object Inpainting with Diffusion Models
upvoted a paper about 2 months ago
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Organizations

Long Video Benchmark, MLVU

authored a paper 2 months ago

EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models

Paper • 2506.01667 • Published Jun 2 • 21
authored 4 papers 3 months ago

MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding

Paper • 2406.04264 • Published Jun 6, 2024 • 2

Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding

Paper • 2409.14485 • Published Sep 22, 2024 • 2

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Paper • 2410.10133 • Published Oct 14, 2024 • 1

Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding

Paper • 2503.18478 • Published Mar 24 • 1