Note

  • Weakness in Complex Scene Creation: Because of limited data, the model struggles to generate complex scenes, text, and human hands.
  • Enhancing Capabilities: Longer, more detailed prompts generally improve the model's output. Below are some examples of prompts and samples.
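
As a rough illustration of the second point, the sketch below contrasts a short prompt with a longer, more descriptive one. The helper `build_prompt` and the prompt wording are hypothetical, written only for this example. They are not part of the model, of Sana, or of diffusers, and the model authors do not prescribe any particular prompt template.

```python
# Hypothetical helper (an assumption for illustration, not a library API):
# joins a subject with optional descriptive details and a style tag.
def build_prompt(subject, details=(), style=None):
    parts = [subject.strip()]
    # Skip empty detail strings so the prompt has no dangling commas.
    parts.extend(d.strip() for d in details if d.strip())
    if style:
        parts.append(f"{style.strip()} style")
    return ", ".join(parts)


# A short prompt leaves most of the scene unspecified.
short_prompt = build_prompt("a mountain lake")

# A longer prompt spells out lighting, composition, and style.
detailed_prompt = build_prompt(
    "a mountain lake at sunrise",
    details=[
        "mist drifting over still water",
        "snow-capped peaks reflected on the surface",
        "pine forest along the shore",
        "soft golden light",
    ],
    style="landscape photography",
)

print(short_prompt)
print(detailed_prompt)
```

Either string can be passed as `prompt` to the pipeline call shown under Model Sources. Whether the longer prompt actually gives a better image depends on the scene and is worth checking with a fixed seed.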

Model Description

Model Sources

For research purposes, we recommend the Sana GitHub repository (https://github.com/NVlabs/Sana), which is better suited to both training and inference and integrates advanced diffusion samplers such as Flow-DPM-Solver. MIT Han-Lab provides free Sana inference.

# pip install git+https://github.com/huggingface/diffusers
# pip install transformers
import torch
from diffusers import SanaPAGPipeline

pipe = SanaPAGPipeline.from_pretrained(
  "kpsss34/SANA600.fp8_landscape_V1",
  torch_dtype=torch.float16,
)
pipe.to("cuda")

pipe.text_encoder.to(torch.bfloat16)
pipe.vae.to(torch.bfloat16)

prompt = 'A cute 🐼 eating 🎋, ink drawing style'
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    guidance_scale=5.0,
    pag_scale=2.0,
    num_inference_steps=20,
    generator=torch.Generator(device="cuda").manual_seed(42),
)[0]
image[0].save('sana.png')