metadata
base_model: stabilityai/stable-diffusion-2-base
library_name: diffusers
license: creativeml-openrail-m
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- diffusers-training
inference: true
Text-to-image finetuning - jffacevedo/pxla_trained_model
This pipeline was finetuned from stabilityai/stable-diffusion-2-base on the lambdalabs/naruto-blip-captions dataset.
Pipeline usage
You can use the pipeline like so:
import torch
import torch_xla.core.xla_model as xm
from diffusers import StableDiffusionPipeline


def main():
    # Run on the XLA device (e.g. a TPU core) exposed by torch_xla.
    device = xm.xla_device()
    # Placeholder: replace with the directory that holds the finetuned pipeline.
    model_path = "<output_dir>"
    pipe = StableDiffusionPipeline.from_pretrained(
        model_path,
        torch_dtype=torch.bfloat16,
    )
    pipe.to(device)
    prompt = ["A naruto with green eyes and red legs."]
    # The pipeline returns one image per prompt; keep the first.
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
    image.save("naruto.png")


if __name__ == '__main__':
    main()
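The call above passes guidance_scale=7.5. As a rough illustration of what that parameter controls, the sketch below applies the standard classifier-free guidance combination to two toy noise predictions. The function name apply_guidance and the toy values are hypothetical and only for illustration; the real pipeline performs this step on tensors inside its denoising loop.

```python
from typing import List


def apply_guidance(uncond: List[float], text: List[float], scale: float) -> List[float]:
    """Combine unconditional and text-conditioned predictions.

    Implements uncond + scale * (text - uncond) element-wise, the usual
    classifier-free guidance rule. Plain lists stand in for tensors here.
    """
    return [u + scale * (t - u) for u, t in zip(uncond, text)]


# Toy predictions (made-up numbers, not model outputs).
uncond_pred = [0.0, 1.0]
text_pred = [1.0, 1.0]

guided = apply_guidance(uncond_pred, text_pred, scale=7.5)
unguided = apply_guidance(uncond_pred, text_pred, scale=1.0)
print(guided)
print(unguided)
```

With a scale of 1.0 the result equals the text-conditioned prediction; larger values push the result further in the direction of the prompt.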
Training info
These are the key hyperparameters used during training:
- Steps: 50
- Learning rate: 1e-06
- Batch size: 32
- Image resolution: 512
- Mixed-precision: bf16
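To put these numbers in perspective, the short sketch below computes how many training samples the run processed. It assumes the listed batch size of 32 is the total batch per optimizer step, with no gradient accumulation on top; the card does not state this either way.

```python
# Values copied from the hyperparameter list above.
steps = 50
batch_size = 32  # assumption: global batch size per optimizer step

# Total number of image-caption pairs seen during finetuning.
samples_seen = steps * batch_size
print(samples_seen)  # 1600
```

Under that assumption the run is a very short finetune, so outputs may stay close to those of the base model.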
Intended uses & limitations
How to use
See the example under Pipeline usage above, which loads the pipeline on an XLA device and generates an image from a text prompt.
Limitations and bias
[TODO: provide examples of latent issues and potential remediations]
Training details
The model was finetuned on the lambdalabs/naruto-blip-captions dataset.