metadata

license: creativeml-openrail-m
base_model: stabilityai/stable-diffusion-3-medium-diffusers
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
  - text-to-image
  - diffusers
  - simpletuner
  - lora
  - template:sd-lora
inference: true
widget:
  - text: unconditional (blank prompt)
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_0_0.png
  - text: >-
      A hauntingly eerie illustration of a human skull with fully articulated
      legs, standing upright, jaw slightly ajar revealing teeth, set against a
      stark, dark background, with the skull's pale bones starkly contrasting
      the surrounding shadows, capturing a surreal fusion of human anatomy and
      skeletal structure in a single, chilling composition.
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_1_0.png

sd3-civitai-lora

This is a LoRA derived from stabilityai/stable-diffusion-3-medium-diffusers.

The main validation prompt used during training was:

A hauntingly eerie illustration of a human skull with fully articulated legs, standing upright, jaw slightly ajar revealing teeth, set against a stark, dark background, with the skull's pale bones starkly contrasting the surrounding shadows, capturing a surreal fusion of human anatomy and skeletal structure in a single, chilling composition.

Validation settings

CFG: 7.5
CFG Rescale: 0.0
Steps: 30
Sampler: euler
Seed: 42
Resolution: 1024

Note: The validation settings are not necessarily the same as the training settings.
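
As a minimal sketch, the validation settings above can be collected into keyword arguments for a diffusers pipeline call. The helper name `validation_call_kwargs` and the constant `VALIDATION_SEED` are hypothetical and only for illustration; the argument names are the ones used in the Inference section of this card. The CFG rescale and sampler entries are intentionally not mapped here, so the pipeline's default scheduler is left unchanged.

```python
# Hypothetical helper: maps the validation settings listed above onto
# the keyword arguments used by the pipeline call in the Inference section.
VALIDATION_SEED = 42  # "Seed: 42"


def validation_call_kwargs(prompt: str) -> dict:
    return {
        "prompt": prompt,
        "negative_prompt": "blurry, cropped, ugly",
        "num_inference_steps": 30,  # "Steps: 30"
        "guidance_scale": 7.5,      # "CFG: 7.5"
        "width": 1024,              # "Resolution: 1024"
        "height": 1024,
    }
```

With a loaded pipeline, this could be used as `pipeline(**validation_call_kwargs(prompt), generator=torch.Generator(device="cuda").manual_seed(VALIDATION_SEED))`.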

You can find some example images in the following gallery:

Prompt
unconditional (blank prompt)

Negative Prompt
blurry, cropped, ugly

Prompt
A hauntingly eerie illustration of a human skull with fully articulated legs, standing upright, jaw slightly ajar revealing teeth, set against a stark, dark background, with the skull's pale bones starkly contrasting the surrounding shadows, capturing a surreal fusion of human anatomy and skeletal structure in a single, chilling composition.

Negative Prompt
blurry, cropped, ugly

The text encoder was not trained. You may reuse the base model text encoder for inference.

Training settings

Training epochs: 2
Training steps: 300
Learning rate: 8e-07
Effective batch size: 8
- Micro-batch size: 1
- Gradient accumulation steps: 4
- Number of GPUs: 2
Prediction type: epsilon
Rescaled betas zero SNR: False
Optimizer: AdamW, stochastic bf16
Precision: Pure BF16
Xformers: Not used
LoRA Rank: 16
LoRA Alpha: 16
LoRA Dropout: 0.1
LoRA initialisation style: default
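
The figures above fit together arithmetically. The short sketch below recomputes the effective batch size from its three factors and estimates how many passes over the dataset 300 steps correspond to, assuming the approximate image count of 872 from the Datasets section and one sample per image per epoch. It also shows the alpha/rank scaling factor used in the common LoRA formulation; whether the trainer applies exactly this scaling is an assumption.

```python
# Values copied from the training settings in this card.
micro_batch_size = 1
gradient_accumulation_steps = 4
num_gpus = 2
training_steps = 300
lora_rank = 16
lora_alpha = 16

# Approximate image count from the Datasets section ("~872").
approx_num_images = 872

# Effective batch size = micro-batch x accumulation steps x GPUs.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Total samples processed, and the approximate number of dataset passes.
samples_seen = training_steps * effective_batch_size
approx_passes = samples_seen / approx_num_images

# Common LoRA scaling factor applied to the low-rank update (assumption).
lora_scale = lora_alpha / lora_rank

print(effective_batch_size, samples_seen, round(approx_passes, 2), lora_scale)
```

Under these assumptions the run covers between two and three passes over the data, which is consistent with the two completed epochs reported above.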

Datasets

civitai-images

Repeats: 0
Total number of images: ~872
Total number of aspect buckets: 1
Resolution: 1024 px
Cropped: True
Crop style: center
Crop aspect: square
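
To illustrate what a center crop with a square aspect means, here is a minimal sketch that computes the crop box for an image of arbitrary size. The function name `center_square_crop_box` is hypothetical, and this is not taken from the trainer's code; the actual preprocessing may differ in details such as rounding or resize order. The box uses the (left, upper, right, lower) convention, and the cropped region would then be resized to 1024 x 1024.

```python
from typing import Tuple


def center_square_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return the (left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# Example: a 1920x1080 landscape image keeps its central 1080x1080 region.
box = center_square_crop_box(1920, 1080)
print(box)
```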

Inference

First, make sure you have the latest version of diffusers installed:

pip install git+https://github.com/huggingface/diffusers.git

Or, if version 0.30.0 has already been released, install it from PyPI:

pip install diffusers==0.30.0

Then run the following code:

import torch
from diffusers import StableDiffusion3Pipeline as DiffusionPipeline

model_id = "stabilityai/stable-diffusion-3-medium-diffusers"
adapter_id = "MohamedRashad/sd3-civitai-lora"

prompt = "A digital artwork capturing a serene moment between a young woman and a black cat, set against a cloudy blue sky with white fluffy clouds and delicate flowers at the bottom. The woman, with her long, wavy pink hair, is elegantly dressed in a blue top featuring a collar, complemented by a flower accessory and a heart-shaped earring. Her striking blue eyes reflect a sense of calm and connection. The black cat, with its glossy coat and a distinctive purple collar, shares a close, affectionate gaze with the woman. The scene is bathed in soft, natural light, with the sun casting a warm, inviting glow on the woman's face, creating a harmonious contrast with the cool, tranquil hues of the sky. The image embodies the essence of 1980s anime, characterized by its pastel color palette and an overall atmosphere of beauty and peacefulness. The artwork is meticulously rendered, showcasing the intricate details of the characters' features, the lush textures of their hair and fur, and the subtle nuances of the environment, all contributing to a visually captivating and emotionally resonant scene."
negative_prompt = "blurry, cropped, ugly"

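# Load the base SD3 pipeline in half precision and move it to the GPU.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to(
    "cuda"
)
# Apply the LoRA adapter weights on top of the base model.
pipeline.load_lora_weights(adapter_id)
# Generate a single 1024x1024 image with a fixed random seed.
image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    generator=torch.Generator(device="cuda").manual_seed(1641421826),
    width=1024,
    height=1024,
    guidance_scale=7.5,
).images[0]
# Write the result to disk as a PNG file.
image.save("output.png", format="PNG")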