RedHat Dog SD3 - Fine-tuned Stable Diffusion 3.5 Model

Model Description

This is a fine-tuned version of Stable Diffusion 3.5 Medium, trained with the DreamBooth technique to generate images of a specific Red Hat-branded dog character ("rhteddy").

Model Details

Base Model: stabilityai/stable-diffusion-3.5-medium
Fine-tuning Method: DreamBooth
Training Data: 5-10 images of Red Hat dog character
Training Steps: 800 steps
Resolution: 512x512 pixels
Hardware: NVIDIA L40S GPU (48GB memory)

Intended Use

This model is designed for:

Generating images of the Red Hat dog character in various contexts
Educational demonstrations of DreamBooth fine-tuning
Corporate branding and marketing content creation
Research into personalized diffusion models

Example

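import torch
from diffusers import DiffusionPipeline

# Load the fine-tuned pipeline in bfloat16 to reduce memory use
pipeline = DiffusionPipeline.from_pretrained(
    "cfchase/redhat-dog-sd3",
    torch_dtype=torch.bfloat16,
)

# Use the GPU when available; CPU inference works but is much slower
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline = pipeline.to(device)

# Generate an image; the prompt includes the trigger phrase "rhteddy dog"
image = pipeline("photo of a rhteddy dog in a park").images[0]
image.save("redhat_dog_park.png")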

Recommended Prompts

The model works best with prompts that include the trigger phrase "rhteddy dog":

"photo of a rhteddy dog"
"rhteddy dog sitting in an office"
"rhteddy dog wearing a Red Hat"
"rhteddy dog in a technology conference"
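
The prompt pattern above can be sketched as a small helper that makes sure the trigger phrase is present before a prompt reaches the pipeline. The name `with_trigger` and its fallback behavior are illustrative assumptions, not part of the model or of Diffusers.

```python
TRIGGER = "rhteddy dog"  # trigger phrase taken from the instance prompt


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Return a prompt that is guaranteed to contain the trigger phrase.

    Hypothetical helper: prompts that already mention the trigger are
    returned unchanged (apart from stripping whitespace); other prompts
    get the trigger prepended.
    """
    cleaned = prompt.strip()
    if trigger.lower() in cleaned.lower():
        return cleaned
    if not cleaned:
        # Fall back to the instance prompt used during training
        return f"photo of a {trigger}"
    return f"{trigger} {cleaned}"


prompts = [with_trigger(p) for p in ("sitting in an office", "photo of a rhteddy dog", "")]
for p in prompts:
    print(p)
```

Each returned string can be passed to the pipeline in place of a hand-written prompt.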

Training Details

Training Configuration

Instance Prompt: "photo of a rhteddy dog"
Class Prompt: "a photo of dog"
Learning Rate: 5e-6
Batch Size: 1
Gradient Accumulation Steps: 2
Optimizer: 8-bit Adam
Scheduler: Constant
Prior Preservation: Enabled with 200 class images
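
As a rough illustration of what these settings imply, the sketch below works out the effective batch size and how often each training image is revisited. It rests on two assumptions the card does not state: training ran on a single GPU, and "800 steps" counts optimizer updates rather than forward passes. The `TrainingSchedule` class is hypothetical and exists only for this arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSchedule:
    batch_size: int = 1        # Batch Size from the configuration above
    grad_accum_steps: int = 2  # Gradient Accumulation Steps
    train_steps: int = 800     # assumed to count optimizer updates
    num_gpus: int = 1          # assumption: single-GPU training

    @property
    def effective_batch_size(self) -> int:
        # Instance images contributing to each optimizer update
        return self.batch_size * self.grad_accum_steps * self.num_gpus

    @property
    def instance_samples_seen(self) -> int:
        # Total instance images processed over the whole run
        return self.effective_batch_size * self.train_steps

    def passes_over(self, num_images: int) -> float:
        # Average number of times each training image is revisited
        return self.instance_samples_seen / num_images


schedule = TrainingSchedule()
print(schedule.effective_batch_size)   # 2
print(schedule.instance_samples_seen)  # 1600
print(schedule.passes_over(5), schedule.passes_over(10))  # 320.0 160.0
```

Under these assumptions each of the 5-10 training images is seen between 160 and 320 times, which is consistent with the overfitting risk noted under Limitations and Bias.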

Training Environment

Platform: Red Hat OpenShift AI (RHOAI)
Framework: Hugging Face Diffusers
Acceleration: xFormers, gradient checkpointing

Model Architecture

This model inherits the architecture of Stable Diffusion 3.5 Medium:

Transformer: SD3Transformer2DModel
VAE: AutoencoderKL
Text Encoders:
- 2x CLIPTextModelWithProjection
- 1x T5EncoderModel
Scheduler: FlowMatchEulerDiscreteScheduler
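
The component list above can be checked against a loaded pipeline with a short helper such as the one sketched here. `component_classes` is a hypothetical name. The attribute names are those exposed by the Diffusers Stable Diffusion 3 pipeline. The stand-in objects are placeholders so the sketch runs without downloading weights, and the assignment of the T5 encoder to `text_encoder_3` follows that pipeline's convention.

```python
from types import SimpleNamespace

# Component attributes exposed by the Stable Diffusion 3 pipeline in Diffusers
SD3_COMPONENTS = (
    "transformer",
    "vae",
    "text_encoder",
    "text_encoder_2",
    "text_encoder_3",
    "scheduler",
)


def component_classes(pipe, names=SD3_COMPONENTS):
    """Map each component attribute to the class name of the object stored there."""
    return {name: type(getattr(pipe, name, None)).__name__ for name in names}


def _stub(class_name):
    # Placeholder object whose class carries the given name (illustration only)
    return type(class_name, (), {})()


fake_pipe = SimpleNamespace(
    transformer=_stub("SD3Transformer2DModel"),
    vae=_stub("AutoencoderKL"),
    text_encoder=_stub("CLIPTextModelWithProjection"),
    text_encoder_2=_stub("CLIPTextModelWithProjection"),
    text_encoder_3=_stub("T5EncoderModel"),
    scheduler=_stub("FlowMatchEulerDiscreteScheduler"),
)

summary = component_classes(fake_pipe)
for name, cls in summary.items():
    print(f"{name}: {cls}")
```

With the real model, calling `component_classes(pipeline)` on the object loaded in the Example section should report the same class names. A missing component shows up as `NoneType`.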

Limitations and Bias

The model is trained specifically on Red Hat-branded imagery and may not generalize well to other contexts
The training set was small (5-10 images), which may result in overfitting
The model inherits any biases present in the Stable Diffusion 3.5 Medium base model
The model is tuned to the specific "rhteddy dog" concept and may struggle with significant variations on it

Training Data

The training data consists of approximately 5-10 high-quality images of the Red Hat dog character, featuring:

Various poses and angles
Consistent visual style and branding
Professional photography quality
Clear subject focus

Technical Specifications

Model Size: ~47GB (full precision weights)
Inference Requirements:
- GPU with 8GB+ VRAM recommended
- CUDA-compatible device
- Python 3.8+
- PyTorch 2.0+
- Diffusers library
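
The software side of these requirements can be checked with a short script like the sketch below. `check_environment` and its helpers are hypothetical names, and the minimum versions are taken from the list above. It uses only the standard library, so it does not verify GPU memory or CUDA availability.

```python
import re
import sys
from importlib import metadata


def parse_major_minor(version: str):
    """Extract (major, minor) from a version string such as '2.1.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    return (int(match.group(1)), int(match.group(2))) if match else (0, 0)


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Report whether each software requirement listed above is satisfied."""
    torch_version = installed_version("torch")
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        "torch>=2.0": torch_version is not None
        and parse_major_minor(torch_version) >= (2, 0),
        "diffusers installed": installed_version("diffusers") is not None,
    }


for requirement, ok in check_environment().items():
    print(f"{requirement}: {'ok' if ok else 'missing'}")
```

Once PyTorch is installed, `torch.cuda.is_available()` covers the CUDA-compatible device requirement.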

License

This model is based on Stable Diffusion 3.5 Medium and is subject to the same licensing terms. Please refer to the original model license for details.

Contact

For questions about this model or the training process, please refer to the Red Hat OpenShift AI documentation or the associated training notebooks.
