Stable Diffusion XL on TPUv5e
Generate images from text prompts with various styles
Note SDXL arXiv: https://arxiv.org/abs/2307.01952 GitHub: https://github.com/Stability-AI/generative-models models: - https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0 - https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0
text-to-image
Note SDXL arXiv: https://arxiv.org/abs/2307.01952 GitHub: https://github.com/Stability-AI/generative-models models: - https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0 - https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0
Explore fun LoRAs and generate with SDXL
Generate stunning high quality illusion artwork
Note IllusionDiffusion model: https://huggingface.co/monster-labs/control_v1p_sd15_qrcode_monster
Generate images from text prompts
Note T2I-Adapter SDXL arXiv: https://arxiv.org/abs/2302.08453 GitHub: https://github.com/TencentARC/T2I-Adapter models: https://huggingface.co/TencentARC/t2i-adapter-canny-sdxl-1.0 https://huggingface.co/TencentARC/t2i-adapter-sketch-sdxl-1.0 https://huggingface.co/TencentARC/t2i-adapter-lineart-sdxl-1.0 https://huggingface.co/TencentARC/t2i-adapter-depth-midas-sdxl-1.0 https://huggingface.co/TencentARC/t2i-adapter-depth-zoe-sdxl-1.0 https://huggingface.co/TencentARC/t2i-adapter-openpose-sdxl-1.0
Note Doodly (T2I-Adapter SDXL) arXiv: https://arxiv.org/abs/2302.08453 GitHub: https://github.com/TencentARC/T2I-Adapter model: https://huggingface.co/TencentARC/t2i-adapter-sketch-sdxl-1.0
Note Hotshot-XL GitHub: https://github.com/hotshotco/Hotshot-XL models: - https://huggingface.co/hotshotco/Hotshot-XL - https://huggingface.co/hotshotco/SDXL-512
Note DreamGaussian arXiv: https://arxiv.org/abs/2309.16653 GitHub: https://github.com/dreamgaussian/dreamgaussian Project page: https://dreamgaussian.github.io/
Inpaint images using prompts
Note SDXL inpainting model: https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1
Note FreeU https://arxiv.org/abs/2309.11497 https://github.com/ChenyangSi/FreeU https://chenyangsi.top/FreeU/
Replace background in images with custom scenes
Edit images using text instructions
Create and edit images using DDPM and SEGA techniques
QR Code AI Art Generator: blend QR codes with AI art
Note Breathing New Life into 3D Assets with Generative Repainting BMVC 2023 https://arxiv.org/abs/2309.08523 https://github.com/kongdai123/repainting_3d_assets https://www.obukhov.ai/repainting_3d_assets
Note InstaFlow https://arxiv.org/abs/2309.06380 https://github.com/gnobitab/InstaFlow
Generate unique images by combining random LoRA models
Generate detailed images using prompts and LoRA models
Note DiffBIR https://arxiv.org/abs/2308.15070 https://github.com/XPixelGroup/DiffBIR https://0x3f3f3f3fun.github.io/projects/diffbir/
Note DenseDiffusion ICCV 2023 arXiv: https://arxiv.org/abs/2308.12964 GitHub: https://github.com/naver-ai/DenseDiffusion
Note StableVideo ICCV 2023 arXiv: https://arxiv.org/abs/2308.09592 GitHub: https://github.com/rese1f/stablevideo
Note InstructCV https://arxiv.org/abs/2310.00390 https://github.com/AlaaLab/InstructCV
Note Animagine XL model: https://huggingface.co/Linaqruf/animagine-xl
Note Fooocus GitHub: https://github.com/lllyasviel/Fooocus
Generate a video waveform from text-based audio descriptions
Note AudioLDM 2 https://arxiv.org/abs/2308.05734 https://github.com/haoheliu/audioldm2 https://audioldm.github.io/audioldm2/
Note TokenFlow https://arxiv.org/abs/2307.10373 https://github.com/omerbt/TokenFlow https://diffusion-tokenflow.github.io/
text-to-image
Note Kandinsky 2.2 date: 2023-07-12 arXiv: https://arxiv.org/abs/2310.03502 GitHub: https://github.com/ai-forever/Kandinsky-2 Blog: https://habr.com/ru/companies/sberbank/articles/747446/
Note PSLD (Posterior Sampling using Latent Diffusion) https://arxiv.org/abs/2307.00619 https://github.com/LituRout/PSLD
Create 3D mesh from a single image
Note One-2-3-45 https://arxiv.org/abs/2306.16928 https://github.com/One-2-3-45/One-2-3-45 https://one-2-3-45.github.io/
text-to-video
Note zeroscope v2 models: - https://huggingface.co/cerspense/zeroscope_v2_576w - https://huggingface.co/cerspense/zeroscope_v2_XL
Note Rerender A Video SIGGRAPH 2023 https://arxiv.org/abs/2306.07954 https://github.com/williamyang1991/Rerender_A_Video https://www.mmlab-ntu.com/project/rerender/
Note Wuerstchen arXiv: https://arxiv.org/abs/2306.00637 GitHub: https://github.com/dome272/Wuerstchen model: https://huggingface.co/warp-ai/wuerstchen
Note GlyphControl NeurIPS 2023 https://arxiv.org/pdf/2305.18259 https://github.com/AIGText/GlyphControl-release
Note Prompt-Free Diffusion https://arxiv.org/abs/2305.16223 https://github.com/SHI-Labs/Prompt-Free-Diffusion
Note NeTI (A Neural Space-Time Representation for Text-to-Image Personalization) SIGGRAPH 2023 https://arxiv.org/abs/2305.15391 https://github.com/NeuralTextualInversion/NeTI https://neuraltextualinversion.github.io/NeTI/
Generate images based on text prompts
Note BLIP-Diffusion https://arxiv.org/abs/2305.14720 https://github.com/salesforce/LAVIS/tree/main/projects/blip-diffusion https://dxli94.github.io/BLIP-Diffusion-website/
Note Control-A-Video https://arxiv.org/abs/2305.13840 https://github.com/Weifeng-Chen/control-a-video https://controlavideo.github.io/
Note ControlVideo https://arxiv.org/abs/2305.13077 https://github.com/YBYBZhang/ControlVideo https://controlvideov1.github.io/
Note TextDiffuser NeurIPS 2023 https://arxiv.org/abs/2305.10855 https://github.com/microsoft/unilm/tree/master/textdiffuser https://jingyechen.github.io/textdiffuser/
Note LDM3D CVPRW 2023 arXiv: https://arxiv.org/abs/2305.10853 model: https://huggingface.co/Intel/ldm3d-pano
Note CoAdapter GitHub: https://github.com/TencentARC/T2I-Adapter
text-to-3D & image-to-3D
Note Shap-E arXiv: https://arxiv.org/abs/2305.02463 GitHub: https://github.com/openai/shap-e models: - https://huggingface.co/openai/shap-e - https://huggingface.co/openai/shap-e-img2img
Note PickScore https://arxiv.org/abs/2305.01569 https://github.com/yuvalkirstain/PickScore
Note IF GitHub: https://github.com/deep-floyd/IF models: - https://huggingface.co/DeepFloyd/IF-I-XL-v1.0 - https://huggingface.co/DeepFloyd/IF-I-L-v1.0 - https://huggingface.co/DeepFloyd/IF-I-M-v1.0 - https://huggingface.co/DeepFloyd/IF-II-L-v1.0 - https://huggingface.co/DeepFloyd/IF-II-M-v1.0
Create detailed images from sketches and other inputs
Note ControlNet ICCV 2023 arXiv: https://arxiv.org/abs/2302.05543 GitHub: - https://github.com/lllyasviel/ControlNet - https://github.com/lllyasviel/ControlNet-v1-1-nightly
Note Rich Text to Image ICCV 2023 arXiv: https://arxiv.org/abs/2304.06720 GitHub: https://github.com/SongweiGe/rich-text-to-image Project page: https://rich-text-to-image.github.io/
Transform images according to a text description
Note An Edit Friendly DDPM Noise Space: Inversion and Manipulations https://arxiv.org/abs/2304.06140 https://github.com/inbarhub/DDPM_inversion https://inbarhub.github.io/DDPM_inversion/
Note Inst-Inpaint https://arxiv.org/abs/2304.03246 https://github.com/abyildirim/inst-inpaint http://instinpaint.abyildirim.com/
Generate 3D human motion videos from text prompts
Note ReMoDiffuse ICCV 2023 https://arxiv.org/abs/2304.01116 https://github.com/mingyuan-zhang/ReMoDiffuse https://mingyuan-zhang.github.io/projects/ReMoDiffuse.html
text-to-image
Note Kandinsky 2.1 date: 2023-04-04 arXiv: https://arxiv.org/abs/2310.03502 GitHub: https://github.com/ai-forever/Kandinsky-2 Blog: https://habr.com/ru/companies/sberbank/articles/725282/
Note Kandinsky 2.1 arXiv: https://arxiv.org/abs/2310.03502 GitHub: https://github.com/ai-forever/Kandinsky-2
Note ComfyUI https://github.com/comfyanonymous/ComfyUI
Note VideoCrafter https://github.com/AILab-CVC/VideoCrafter
Note vid2vid-zero https://arxiv.org/abs/2303.17599 https://github.com/baaivision/vid2vid-zero
Note PAIR-Diffusion https://arxiv.org/abs/2303.17546 https://github.com/Picsart-AI-Research/PAIR-Diffusion
Note Better Aligning Text-to-Image Models with Human Preference ICCV 2023 https://arxiv.org/abs/2303.14420 https://github.com/tgxs002/align_sd https://tgxs002.github.io/align_sd_web/
Note MDT (Masked Diffusion Transformer) https://arxiv.org/abs/2303.14389 https://github.com/sail-sg/MDT
Note Ablating Concepts in Text-to-Image Diffusion Models ICCV 2023 arXiv: https://arxiv.org/abs/2303.13516 GitHub: https://github.com/nupurkmr9/concept-ablation Project page: https://www.cs.cmu.edu/~concept-ablation/
Note ReVersion https://arxiv.org/abs/2303.13495 https://github.com/ziqihuangg/ReVersion https://ziqihuangg.github.io/projects/reversion.html
Note Text2Video-Zero ICCV 2023 https://arxiv.org/abs/2303.13439 https://github.com/Picsart-AI-Research/Text2Video-Zero https://text2video-zero.github.io/
Note SALAD ICCV 2023 arXiv: https://arxiv.org/abs/2303.12236 GitHub: https://github.com/63days/salad Project page: https://salad3d.github.io/
Note Zero-1-to-3 ICCV 2023 https://arxiv.org/abs/2303.11328 https://github.com/cvlab-columbia/zero123 https://zero123.cs.columbia.edu/
Create novel views of images by adjusting angles and zoom
Note Zero-1-to-3 ICCV 2023 https://arxiv.org/abs/2303.11328 https://github.com/cvlab-columbia/zero123 https://zero123.cs.columbia.edu/
Note Local prompt mixing ICCV 2023 https://arxiv.org/abs/2303.11306 https://github.com/orpatashnik/local-prompt-mixing https://orpatashnik.github.io/local-prompt-mixing/
Note SVDiff https://arxiv.org/abs/2303.11305 https://github.com/mkshing/svdiff-pytorch (unofficial)
Note FateZero ICCV 2023 https://arxiv.org/abs/2303.09535 https://github.com/ChenyangQiQi/FateZero https://fate-zero-edit.github.io/
Note 3DFuse https://arxiv.org/abs/2303.07937 https://github.com/KU-CVLAB/3DFuse https://ku-cvlab.github.io/3DFuse/
Note Erasing Concepts from Diffusion Models ICCV 2023 https://arxiv.org/abs/2303.07345 https://github.com/rohitgandikota/erasing https://erasing.baulab.info/
Note ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models CVPR 2023 https://arxiv.org/abs/2303.04803 https://github.com/NVlabs/ODISE
Note Video-P2P https://arxiv.org/abs/2303.04761 https://github.com/ShaoTengLiu/Video-P2P https://video-p2p.github.io/
Note Word-As-Image SIGGRAPH 2023 https://arxiv.org/abs/2303.01818 https://github.com/Shiriluz/Word-As-Image https://wordasimage.github.io/Word-As-Image-Page/
Generate images or text from text or images
Note UniDiffuser https://arxiv.org/abs/2303.06555 https://github.com/thu-ml/unidiffuser
Note ELITE ICCV 2023 https://arxiv.org/abs/2302.13848 https://github.com/csyxwei/ELITE
Create images from various types of annotations
Note ControlNet ICCV 2023 arXiv: https://arxiv.org/abs/2302.05543 GitHub: - https://github.com/lllyasviel/ControlNet - https://github.com/lllyasviel/ControlNet-v1-1-nightly
Note pix2pix-zero SIGGRAPH 2023 https://arxiv.org/abs/2302.03027 https://github.com/pix2pixzero/pix2pix-zero https://pix2pixzero.github.io/
Note TEXTure https://arxiv.org/abs/2302.01721 https://github.com/TEXTurePaper/TEXTurePaper https://texturepaper.github.io/TEXTurePaper/
Generate images from text prompts with attention guidance
Note Attend-and-Excite SIGGRAPH 2023 arXiv: https://arxiv.org/abs/2301.13826 GitHub: https://github.com/yuval-alaluf/Attend-and-Excite Project page: https://yuval-alaluf.github.io/Attend-and-Excite/
Generate music from text prompts
Note Riffusion GitHub: https://github.com/riffusion/riffusion
Generate animated videos from text prompts
Note Tune-A-Video ICCV 2023 https://github.com/showlab/Tune-A-Video https://arxiv.org/abs/2212.11565 https://tuneavideo.github.io/
Train a custom video model
Note Tune-A-Video ICCV 2023 https://github.com/showlab/Tune-A-Video https://arxiv.org/abs/2212.11565 https://tuneavideo.github.io/
Enhance and colorize low-resolution images
Note DDNM-HQ ICLR 2023 https://arxiv.org/abs/2212.00490 https://github.com/wyhuai/DDNM/tree/main/hq_demo https://wyhuai.github.io/ddnm.io/
Generate animations from images or prompts
Note Plug-and-Play Diffusion Features CVPR 2023 https://arxiv.org/abs/2211.12572 https://github.com/MichalGeyer/plug-and-play https://pnp-diffusion.github.io/
Generate images based on textual prompts with resolution options
Note Multiresolution Textual Inversion NeurIPS 2022 Workshop https://arxiv.org/abs/2211.17115 https://github.com/giannisdaras/multires_textual_inversion
Note SAG (Self-Attention Diffusion Guidance) ICCV 2023 https://arxiv.org/abs/2210.00939 https://github.com/SusungHong/Self-Attention-Guidance https://ku-cvlab.github.io/Self-Attention-Guidance
Generate images from text prompts
Train models with images and text
Train Stable Diffusion with custom images
Generate images from text descriptions
Note Stable Diffusion 2.1 model: https://huggingface.co/stabilityai/stable-diffusion-2
Note LCM (Latent Consistency Models) arXiv: https://arxiv.org/abs/2310.04378 GitHub: https://github.com/luosiallen/latent-consistency-model Project page: https://latent-consistency-models.github.io/ model: https://huggingface.co/Lykon/dreamshaper-7
Note SSD-1B (Segmind Stable Diffusion 1B) model: https://huggingface.co/segmind/SSD-1B
Note Zero123++ arXiv: https://arxiv.org/abs/2310.15110 GitHub: https://github.com/sudo-ai-3d/zero123plus models: - https://huggingface.co/sudo-ai/zero123plus-v1.1 - https://huggingface.co/sudo-ai/controlnet-zp11-depth-v1
Generate 3D views from a single image
Note Wonder3D arXiv: https://arxiv.org/abs/2310.15008 GitHub: https://github.com/xxlong0/Wonder3D
Note PixArt-alpha arXiv: https://arxiv.org/abs/2310.00426 GitHub: https://github.com/PixArt-alpha/PixArt-alpha
Note LongerCrafter arXiv: https://arxiv.org/abs/2310.15169 GitHub: https://github.com/arthur-qiu/LongerCrafter Project page: http://haonanqiu.com/projects/FreeNoise.html
Note Stable Diffusion Reference Only arXiv: https://arxiv.org/abs/2311.02343 GitHub: https://github.com/aihao2000/stable-diffusion-reference-only
Generate images from text prompts
Generate images based on prompts and input images
Generate a short video from an image
Note One More Step (OMS) arXiv: GitHub: https://github.com/mhh0318/OneMoreStep Project page: https://jabir-zheng.github.io/OneMoreStep/
Note Cross-Image Attention for Zero-Shot Appearance Transfer arXiv: https://arxiv.org/abs/2311.03335 GitHub: https://github.com/garibida/cross-image-attention
Note Concept Sliders arXiv: https://arxiv.org/abs/2311.12092 GitHub: https://github.com/rohitgandikota/sliders Project page: https://sliders.baulab.info/
Note LucidDreamer arXiv: https://arxiv.org/abs/2311.11284 GitHub: https://github.com/EnVision-Research/LucidDreamer
Edit images using text prompts and concepts
Generate images from text or modify images with prompts
Generate images from text prompts
Generate images from text prompts
text-to-image
Generate images from text prompts with layout planning
Generate images using diffusion models with various styles
Generate images from text prompts
Generate edited video based on text prompts
Teleport target objects to new backgrounds
Generate high-resolution images with text prompts
A demo of OpenDalle V1.1 on ZeroGPU.
Generate and detect watermarked images from prompts
Generate AI images with your face
Train LoRAs with Ease
Generate images with text and edit existing images
Magnify subject details and enhance image quality
Generate customized photos using real images and prompts
Generate customized face images with styles
Generate personalized images with face preservation
Generate videos from images and text prompts
Browse and run thousands of community trained LoRAs
Generate an image from description and subject images
Generate images from text prompts
Super-fast image generation on SDX
Real-Time Image Generation with SDXL Lightning
Edit images using prompts and change maps
Generate face videos from text
Generate highly aesthetic images
Official Demo Space for Trajectory Consistency Distillation
Generate high-resolution images from text prompts
Generate images from text and bounding boxes
High-quality virtual try-on ~ Your cyber fitting room
Generate 3D models from text prompts
LayerDiffusion Transparent Image Layer Diffusion Demo
Generate 3D models from single images
Create art with a brush that paints meanings
Generate animated videos from text prompts
Edit images by providing prompts and noise settings
Create and share 2K arts in 30s with Animagine XL 3.1
Real-time image-to-image using pix2pix turbo
SDXS-1024 is to be released; this demo is based on SDXS-512-0.9
Generate images of a person from a face photo
Create an animated video from audio and a reference image
A Diffusion-free One-Step Visual Perception Generalist Model
Video Editing
Style-Preserving Text-to-Image Generation
Generate and edit images using text prompts
Generate stunning 4K images from text prompts
Generate images from text prompts with advanced settings
Generate images based on various image conditions
Generate images from sketches
Generate images from text prompts
Generate larger images by expanding an existing photo
Text-to-Image InstantStyle with SDXL Lightning 9s generation
High-fidelity Virtual Try-on
Text-to-Image InstantStyle with Hyper SDXL
Generate large images on pre-trained Stable Diffusion models
Generate images from text prompts and reference images
Generate customized portraits using ID images and prompts
SD arch trained from scratch on Creative Commons dataset
Create images with enhanced prompts
Creative Upscaler High-Res Image Generation HiDiffusion SDXL
Generate relit images from your photo
Generate consistent character images from prompts
Generate images from text captions
Generate 3D models from images
Generate images from sketches using ControlNet
Generate images with specific camera settings
Generate images from sketches using ControlNet and SDXL
Generate a cartoon video from two images
Generate images from text prompts
Generate audio from text
Generate images from text prompts
Edit images based on text prompts
Fastest high-quality video diffusion model.
Generate animated videos from configuration files
Generate images from text prompts
Restylize & repose person ID
Generate images using ControlNet with prompts
Generate high-quality images from text prompts
Real-Time Stable Diffusion 3 with Flash Diffusion and TAESD3
Fast multi-prompt generation with Stable Diffusion 3
Video upscaler/restorer
Generate photorealistic images from text
Generate images from prompts using powerful LoRA models
Generate a video from a text prompt
Generate customized photos of a person based on an image and prompt
Create detailed images using text prompts and reference images
Create virtual outfits by combining images
Text-to-Video
Multimodal Image-to-Video
Generate images from text prompts
Generate images from text prompts
Edit images based on source and target prompts
Generate images based on input image and prompt
Generate and edit personalized images using a pre-trained model
Generate custom images using LoRA models
Generate art from images using prompts and control modes
Training-free personalization of diffusion models with stochastic optimal control
Content-Style Composition (GoGoGo)
Create a video from an image with camera motion
Generate images using LoRA models
Generate images from text prompts
Generate videos from text prompts
Generate customized images using text and an ID image
Easily expand image boundaries
Ultra-high resolution image synthesis
Generate animated characters from images
Text-to-Video
Upscale low-resolution images
Efficient T2V generation
Generate images fast with SD3.5 turbo
Generate images with SD3.5