
Inui

Norm
https://normxu.github.io/

AI & ML interests

Video Diffusion; Large Language Model; Object Detection; OCR

Recent Activity

upvoted a paper about 6 hours ago
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
upvoted a paper 5 days ago
One-shot Entropy Minimization
upvoted a paper 13 days ago
MMaDA: Multimodal Large Diffusion Language Models

Organizations

  • Social Post Explorers
  • Hugging Face Discord Community

Collections 9

VAE
  • WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

    Paper • 2411.17459 • Published Nov 26, 2024 • 11
  • MAGVIT: Masked Generative Video Transformer

    Paper • 2212.05199 • Published Dec 10, 2022
  • Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

    Paper • 2310.05737 • Published Oct 9, 2023 • 4
  • Finite Scalar Quantization: VQ-VAE Made Simple

    Paper • 2309.15505 • Published Sep 27, 2023 • 22
Video2Video
  • Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

    Paper • 2410.10792 • Published Oct 14, 2024 • 31

Papers 1

arxiv:2504.07491

Models 2

Norm/nougat-latex-base

Image-to-Text • Updated Feb 26, 2024 • 2.91k • 79

Norm/ERNIE-Layout-Pytorch

Updated Nov 14, 2023 • 2.27k • 16

Datasets 0

None public yet