Han-Bit Kang

hbkang

AI & ML interests

ML

Recent Activity

liked a model 2 days ago

MizzenAI/HPSv3

upvoted a paper 2 days ago

HPSv3: Towards Wide-Spectrum Human Preference Score

updated a collection 22 days ago

talking-head-generation

Organizations

None yet

upvoted a paper 2 days ago

HPSv3: Towards Wide-Spectrum Human Preference Score

Paper • 2508.03789 • Published 4 days ago • 13

upvoted 3 papers 22 days ago

FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers

Paper • 2507.12956 • Published 23 days ago • 22

FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Paper • 2507.12720 • Published 23 days ago • 8

SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation

Paper • 2507.09862 • Published 26 days ago • 48

upvoted 3 papers about 1 month ago

Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation

Paper • 2506.19852 • Published Jun 24 • 40

Peccavi: Visual Paraphrase Attack Safe and Distortion Free Image Watermarking Technique for AI-Generated Images

Paper • 2506.22960 • Published Jun 28 • 6

HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling

Paper • 2506.20452 • Published Jun 25 • 18

upvoted 5 papers about 2 months ago

Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model

Paper • 2506.15682 • Published Jun 18 • 5

JAFAR: Jack up Any Feature at Any Resolution

Paper • 2506.11136 • Published Jun 10 • 10

The Diffusion Duality

Paper • 2506.10892 • Published Jun 12 • 38

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published Jun 9 • 88

Seeing Voices: Generating A-Roll Video from Audio with Mirage

Paper • 2506.08279 • Published Jun 9 • 28

upvoted a paper 2 months ago

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5 • 44

upvoted 7 papers 3 months ago

Training-Free Efficient Video Generation via Dynamic Token Carving

Paper • 2505.16864 • Published May 22 • 22

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21 • 95

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8 • 82

Softpick: No Attention Sink, No Massive Activations with Rectified Softmax

Paper • 2504.20966 • Published Apr 29 • 32

Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions

Paper • 2504.19056 • Published Apr 27 • 18

Group Downsampling with Equivariant Anti-aliasing

Paper • 2504.17258 • Published Apr 24 • 9

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Paper • 2504.16064 • Published Apr 22 • 14