WonJae Roh

snuro

AI & ML interests

None yet

Organizations

None yet

snuro's activity

upvoted a paper 4 days ago

Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction

Paper • 2503.16194 • Published 14 days ago • 8

upvoted 3 papers 17 days ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published 20 days ago • 18

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published 20 days ago • 125

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Paper • 2503.07677 • Published 25 days ago • 81

upvoted 2 papers 22 days ago

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published 24 days ago • 95

"Principal Components" Enable A New Language of Images

Paper • 2503.08685 • Published 23 days ago • 11

upvoted 12 papers about 1 month ago

VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing

Paper • 2502.17258 • Published Feb 24 • 77

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 74

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 56

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 82

GHOST 2.0: generative high-fidelity one shot transfer of heads

Paper • 2502.18417 • Published Feb 25 • 65

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 71

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 150

Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published Feb 18 • 82

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 178

Learning Getting-Up Policies for Real-World Humanoid Robots

Paper • 2502.12152 • Published Feb 17 • 40

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14 • 109

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published Feb 13 • 192

upvoted 2 papers 7 months ago

Training-free Long Video Generation with Chain of Diffusion Model Experts

Paper • 2408.13423 • Published Aug 24, 2024 • 23

TVG: A Training-free Transition Video Generation Method with Diffusion Models

Paper • 2408.13413 • Published Aug 24, 2024 • 14