metadata

base_model:
  - OnomaAIResearch/Illustrious-xl-early-release-v0
datasets:
  - ShinoharaHare/Danbooru-2024-Filtered-1M
language:
  - en
library_name: diffusers
license: openrail++
pipeline_tag: text-to-image
tags:
  - anime
  - art
  - stable-diffusion
  - stable-diffusion-xl

WAI-NSFW-illustrious-SDXL-V14.0-V-Prediction

An unofficial V-Prediction version of WAI-NSFW-illustrious-SDXL-V14.0.

Task	Model
Generation	WAI-NSFW-illustrious-SDXL-V14.0-V-Prediction
Inpainting	Waifu-Inpaint-XL
Colorizing	Waifu-Colorize-XL

Overview

WAI-NSFW-illustrious-SDXL-V14.0-V-Prediction is an anime-themed SDXL model, further fine-tuned from WAI-NSFW-illustrious-SDXL-V14.0. The fine-tuning switches the training objective from Epsilon prediction to V-Prediction and incorporates Zero Terminal Signal-to-Noise Ratio (SNR).
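
To illustrate the two ideas named in this overview, here is a minimal sketch in plain Python. It is not this model's training code. It assumes the `scaled_linear` beta schedule commonly used by Stable Diffusion models (`beta_start=0.00085`, `beta_end=0.012`, 1000 steps), applies the shift-and-scale rescaling commonly used to enforce zero terminal SNR, and computes the V-Prediction target `v = sqrt(alpha_bar) * eps - sqrt(1 - alpha_bar) * x0`. All function names are hypothetical.

```python
import math


def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Assumed schedule: linear in sqrt(beta), then squared."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def rescale_zero_terminal_snr(betas):
    """Shift and scale sqrt(alpha_bar) so the last value is exactly zero."""
    sqrt_ab = [math.sqrt(a) for a in cumulative_alphas(betas)]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the terminal value becomes 0, then scale so the first value is kept.
    sqrt_ab = [(s - last) * first / (first - last) for s in sqrt_ab]
    alpha_bar = [s * s for s in sqrt_ab]
    # Recover per-step alphas from consecutive ratios of alpha_bar.
    alphas = [alpha_bar[0]] + [
        alpha_bar[i] / alpha_bar[i - 1] for i in range(1, len(alpha_bar))
    ]
    return [1.0 - a for a in alphas]


def v_target(x0, eps, alpha_bar):
    """V-Prediction target for a clean sample x0 and noise eps (scalars here)."""
    return math.sqrt(alpha_bar) * eps - math.sqrt(1.0 - alpha_bar) * x0
```

With the original schedule the final `alpha_bar` is small but positive, so the last training step still contains a trace of the image. After rescaling it is exactly zero, and the V-Prediction target at that step reduces to `-x0`, which stays informative where an Epsilon target would simply equal the input noise.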

Note that this model was distilled from itself, so in its current state it may not show significant advantages over the Epsilon version. However, it can serve as a foundation for downstream tasks and further training that leverage the benefits of V-Prediction, potentially achieving better results.

Model Details

Developed by: ShinoharaHare
Model type: Diffusion-based text-to-image generative model
Language(s) (NLP): English
License: CreativeML Open RAIL++-M
Finetuned from: WAI-NSFW-illustrious-SDXL-V14.0

🧨 Diffusers

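import torch
from diffusers import StableDiffusionXLPipeline

# Load the pipeline in half precision; from_pretrained also reads the
# scheduler configuration stored in the repository.
pipeline = StableDiffusionXLPipeline.from_pretrained(
    'ShinoharaHare/WAI-NSFW-illustrious-SDXL-V14.0-V-Prediction',
    torch_dtype=torch.float16
)
pipeline.to('cuda')  # requires a CUDA-capable GPU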

prompt = '1girl, hatsune miku, vocaloid, blue eyes, blue hair, bowl, can, chopsticks, collared shirt, detached sleeves, eating, elbow rest, fish (food), food, holding, holding chopsticks, katsudon (food), long hair, long sleeves, looking at viewer, meal, nail polish, necktie, noodles, onigiri, plate, ramen, sashimi, shirt, shrimp, shrimp tempura, sleeveless, sleeveless shirt, solo, spring onion, tempura, twintails'

# Generate one portrait-orientation image (832x1216).
image = pipeline(
    prompt,
    width=832,
    height=1216,
    guidance_scale=5.0,
    num_inference_steps=28
).images[0]
image.show()  # opens the image in the default viewer