RedHat Dog SD3 - Fine-tuned Stable Diffusion 3.5 Model

Model Description

This is a fine-tuned version of Stable Diffusion 3.5 Medium, trained with the DreamBooth technique to generate images of a specific Red Hat branded dog character identified by the token "rhteddy".

Model Details

  • Base Model: stabilityai/stable-diffusion-3.5-medium
  • Fine-tuning Method: DreamBooth
  • Training Data: 5-10 images of Red Hat dog character
  • Training Steps: 800 steps
  • Resolution: 512x512 pixels
  • Hardware: NVIDIA L40S GPU (48GB memory)

Intended Use

This model is designed for:

  • Generating images of the Red Hat dog character in various contexts
  • Educational demonstrations of DreamBooth fine-tuning
  • Corporate branding and marketing content creation
  • Research into personalized diffusion models

Example

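import torch
from diffusers import DiffusionPipeline

# Load the fine-tuned pipeline in bfloat16 to reduce memory use
pipeline = DiffusionPipeline.from_pretrained(
    "cfchase/redhat-dog-sd3",
    torch_dtype=torch.bfloat16
)

# Use the GPU when available; CPU inference works but is much slower
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline.to(device)

# Generate an image using the "rhteddy dog" trigger phrase and save it
image = pipeline("photo of a rhteddy dog in a park").images[0]
image.save("redhat_dog_park.png")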

Recommended Prompts

The model works best with prompts that include the trigger phrase "rhteddy dog":

  • "photo of a rhteddy dog"
  • "rhteddy dog sitting in an office"
  • "rhteddy dog wearing a Red Hat"
  • "rhteddy dog in a technology conference"
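
The pattern in these prompts can be captured in a small helper that adds the trigger phrase when a prompt lacks it. This is a minimal sketch: `TRIGGER` matches the phrase given on this card, while `with_trigger` is a hypothetical helper name used for illustration and is not part of Diffusers.

```python
TRIGGER = "rhteddy dog"  # trigger phrase from this model card


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Return the prompt unchanged if it already contains the trigger
    phrase; otherwise prepend the trigger phrase to it."""
    text = prompt.strip()
    if trigger.lower() in text.lower():
        return text
    return f"{trigger} {text}" if text else trigger


# Hypothetical scene descriptions; only the first already has the trigger
prompts = [with_trigger(p) for p in (
    "photo of a rhteddy dog",
    "sitting in an office",
    "at a technology conference",
)]
for p in prompts:
    print(p)
```

Each resulting string can be passed to the pipeline in place of the literal prompt used in the example.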

Training Details

Training Configuration

  • Instance Prompt: "photo of a rhteddy dog"
  • Class Prompt: "a photo of dog"
  • Learning Rate: 5e-6
  • Batch Size: 1
  • Gradient Accumulation Steps: 2
  • Optimizer: 8-bit Adam
  • Scheduler: Constant
  • Prior Preservation: Enabled with 200 class images
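
These settings imply an effective batch size and a rough number of passes over each training image. The arithmetic below is a sketch under two assumptions that the card does not state: training ran on a single GPU, and the 800 training steps listed under Model Details count optimizer updates rather than forward passes.

```python
train_batch_size = 1
gradient_accumulation_steps = 2
training_steps = 800  # from Model Details; assumed to count optimizer updates
num_gpus = 1          # assumption: a single GPU

# Samples contributing to each optimizer update
effective_batch = train_batch_size * gradient_accumulation_steps * num_gpus

# Total instance images processed over the whole run
instance_samples_seen = training_steps * effective_batch

for n_images in (5, 10):  # the card reports 5-10 training images
    passes = instance_samples_seen / n_images
    print(f"{n_images} images -> about {passes:.0f} passes per image")
```

Under these assumptions each image is revisited a few hundred times, which fits the overfitting risk noted under Limitations and Bias.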

Training Environment

  • Platform: Red Hat OpenShift AI (RHOAI)
  • Framework: Hugging Face Diffusers
  • Speed and Memory Optimizations: xFormers, gradient checkpointing

Model Architecture

This model inherits the architecture of Stable Diffusion 3.5 Medium:

  • Transformer: SD3Transformer2DModel
  • VAE: AutoencoderKL
  • Text Encoders:
    • 2x CLIPTextModelWithProjection
    • 1x T5EncoderModel
  • Scheduler: FlowMatchEulerDiscreteScheduler
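
One way to check this list against a loaded pipeline is to read the `components` mapping that Diffusers pipelines expose. The helper below is a sketch; `component_classes` is a hypothetical name, and it assumes only that the object passed in has a `components` dictionary.

```python
def component_classes(pipeline) -> dict:
    """Map each loaded pipeline component name to its class name.

    Components that are None (not loaded) are skipped.
    """
    return {
        name: type(obj).__name__
        for name, obj in pipeline.components.items()
        if obj is not None
    }


# Usage with the `pipeline` object from the Example section:
# for name, cls in component_classes(pipeline).items():
#     print(f"{name}: {cls}")
```

If the card is accurate, the output should include the transformer, VAE, three text encoders, and scheduler classes listed above, alongside the tokenizers.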

Limitations and Bias

  • The model is specifically trained on Red Hat branded imagery and may not generalize well to other contexts
  • Training data was limited to a small dataset, which may result in overfitting
  • The model inherits any biases present in the base Stable Diffusion 3.5 model
  • The model is optimized for the specific "rhteddy dog" concept and may struggle with significant variations

Training Data

The training data consists of approximately 5-10 high-quality images of the Red Hat dog character, featuring:

  • Various poses and angles
  • Consistent visual style and branding
  • Professional photography quality
  • Clear subject focus

Technical Specifications

  • Model Size: ~47GB (full precision weights)
  • Inference Requirements:
    • GPU with 8GB+ VRAM recommended
    • CUDA-compatible device
    • Python 3.8+
    • PyTorch 2.0+
    • Diffusers library
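
Part of this list can be checked before downloading any weights. The sketch below uses only the standard library; `check_environment` is a hypothetical helper name. It confirms the Python version and that the packages can be found, but it does not verify the PyTorch version, CUDA support, or available VRAM.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 8), packages=("torch", "diffusers")) -> dict:
    """Report whether the interpreter version and the packages
    listed on this card are available."""
    report = {"python_ok": sys.version_info >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


report = check_environment()
for item, ok in report.items():
    print(f"{item}: {'ok' if ok else 'missing'}")
```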

License

This model is based on Stable Diffusion 3.5 Medium and is subject to the same licensing terms. Please refer to the original model license for details.

Contact

For questions about this model or the training process, please refer to the Red Hat OpenShift AI documentation or the associated training notebooks.
