RedHat Dog SD3 - Fine-tuned Stable Diffusion 3.5 Model
Model Description
This is a fine-tuned version of Stable Diffusion 3.5 Medium trained using the Dreambooth technique to generate images of a specific Red Hat branded dog character ("rhteddy").
Model Details
- Base Model: stabilityai/stable-diffusion-3.5-medium
- Fine-tuning Method: Dreambooth
- Training Data: 5-10 images of Red Hat dog character
- Training Steps: 800 steps
- Resolution: 512x512 pixels
- Hardware: NVIDIA L40S GPU (48GB memory)
Intended Use
This model is designed for:
- Generating images of the Red Hat dog character in various contexts
- Educational demonstrations of Dreambooth fine-tuning
- Corporate branding and marketing content creation
- Research into personalized diffusion models
Example
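import torch
from diffusers import DiffusionPipeline

# Load the fine-tuned weights in bfloat16 to reduce memory use
pipeline = DiffusionPipeline.from_pretrained(
    "cfchase/redhat-dog-sd3",
    torch_dtype=torch.bfloat16,
)

# Use the GPU when available; CPU inference works but is much slower
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline = pipeline.to(device)

# Generate an image using the "rhteddy dog" trigger phrase and save it
image = pipeline("photo of a rhteddy dog in a park").images[0]
image.save("redhat_dog_park.png")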
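The pipeline call also accepts explicit sampling arguments. As a minimal sketch, the helper below collects the commonly tuned ones in one place. `build_generation_kwargs` is a hypothetical name, and the step count and guidance scale are illustrative assumptions rather than values tuned for this model; the 512x512 size mirrors the training resolution listed under Model Details.

```python
def build_generation_kwargs(prompt, steps=28, guidance_scale=4.5, size=512):
    """Collect keyword arguments for a Stable Diffusion 3 pipeline call.

    The defaults are illustrative assumptions, not tuned values.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
        "height": size,
        "width": size,
    }


kwargs = build_generation_kwargs("photo of a rhteddy dog in a park")
# With a loaded pipeline: image = pipeline(**kwargs).images[0]
```

Keeping the arguments in a plain dictionary makes it easy to log or vary them between runs when comparing outputs.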
Recommended Prompts
The model works best with prompts that include the trigger phrase "rhteddy dog":
"photo of a rhteddy dog"
"rhteddy dog sitting in an office"
"rhteddy dog wearing a Red Hat"
"rhteddy dog in a technology conference"
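Because the fine-tuned concept is tied to the trigger phrase, it can help to check prompts before sending them to the pipeline. The following is a small sketch of one way to do that; `with_trigger` is a hypothetical helper, and prepending the phrase is an assumption about what works, not a documented requirement.

```python
TRIGGER = "rhteddy dog"


def with_trigger(prompt, trigger=TRIGGER):
    """Return the prompt as-is if it already contains the trigger phrase,
    otherwise prepend the trigger phrase."""
    text = prompt.strip()
    if trigger.lower() in text.lower():
        return text
    return f"{trigger} {text}" if text else trigger


print(with_trigger("sitting in an office"))    # rhteddy dog sitting in an office
print(with_trigger("photo of a rhteddy dog"))  # photo of a rhteddy dog
```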
Training Details
Training Configuration
- Instance Prompt: "photo of a rhteddy dog"
- Class Prompt: "a photo of dog"
- Learning Rate: 5e-6
- Batch Size: 1
- Gradient Accumulation Steps: 2
- Optimizer: 8-bit Adam
- Scheduler: Constant
- Prior Preservation: Enabled with 200 class images
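To put these settings in perspective, the arithmetic below estimates how many times each instance image is seen during training. It is a rough sketch that assumes the 800 steps count optimizer updates on a single device and ignores the class images added by prior preservation; `training_exposure` is a hypothetical helper.

```python
def training_exposure(steps, batch_size, grad_accum, num_images):
    """Estimate how often each instance image is seen during training.

    Assumes `steps` counts optimizer updates on a single device.
    """
    effective_batch = batch_size * grad_accum
    samples_seen = steps * effective_batch
    return effective_batch, samples_seen, samples_seen / num_images


for n in (5, 10):
    eff, seen, per_image = training_exposure(800, 1, 2, n)
    print(f"{n} images: effective batch {eff}, {seen} samples, "
          f"{per_image:.0f} passes per image")
```

Under these assumptions the effective batch size is 2 and each image is seen roughly 160 to 320 times, which is consistent with the overfitting risk noted under Limitations and Bias.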
Training Environment
- Platform: Red Hat OpenShift AI (RHOAI)
- Framework: Hugging Face Diffusers
- Acceleration: xFormers, gradient checkpointing
Model Architecture
This model inherits the architecture of Stable Diffusion 3.5 Medium:
- Transformer: SD3Transformer2DModel
- VAE: AutoencoderKL
- Text Encoders:
  - 2x CLIPTextModelWithProjection
  - 1x T5EncoderModel
- Scheduler: FlowMatchEulerDiscreteScheduler
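One way to confirm the components listed above is to inspect the `components` dictionary that Diffusers pipelines expose. The sketch below uses a stand-in object so it runs without downloading weights; `summarize_components` and `FakeScheduler` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def summarize_components(pipeline):
    """Map each pipeline component name to its class name.

    Relies on the `components` dict that Diffusers pipelines expose;
    entries set to None (unused optional parts) are skipped.
    """
    return {
        name: type(component).__name__
        for name, component in pipeline.components.items()
        if component is not None
    }


# Stand-in object so the sketch runs without loading the real model.
class FakeScheduler:
    pass


stub = SimpleNamespace(components={"scheduler": FakeScheduler(), "unused": None})
print(summarize_components(stub))  # {'scheduler': 'FakeScheduler'}
```

Passing the loaded pipeline instead of the stub should list the class names given in this section.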
Limitations and Bias
- The model is specifically trained on Red Hat branded imagery and may not generalize well to other contexts
- Training data was limited to a small dataset, which may result in overfitting
- The model inherits any biases present in the base Stable Diffusion 3.5 model
- Performance is optimized for the specific "rhteddy dog" concept and may struggle with significant variations
Training Data
The training data consists of approximately 5-10 high-quality images of the Red Hat dog character, featuring:
- Various poses and angles
- Consistent visual style and branding
- Professional photography quality
- Clear subject focus
Technical Specifications
- Model Size: ~47GB (full precision weights)
- Inference Requirements:
  - GPU with 8GB+ VRAM recommended
  - CUDA-compatible device
  - Python 3.8+
  - PyTorch 2.0+
  - Diffusers library
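As a rough guide to the VRAM figure above, weight memory scales with parameter count times bytes per parameter. The sketch below applies that formula; the 2.5 billion parameter count is an assumption based on the publicly quoted size of the Stable Diffusion 3.5 Medium transformer and excludes the text encoders and VAE.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params, dtype):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: approximate parameter count of the transformer only.
TRANSFORMER_PARAMS = 2_500_000_000

for dtype in ("float32", "bfloat16"):
    print(dtype, weight_memory_gb(TRANSFORMER_PARAMS, dtype))
```

Text encoders, the VAE, and activations add to this estimate, so on a card near the 8GB mark it may be necessary to use an offloading option such as the pipeline's `enable_model_cpu_offload()` method.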
License
This model is based on Stable Diffusion 3.5 Medium and is subject to the same licensing terms. Please refer to the original model license for details.
Contact
For questions about this model or the training process, please refer to the Red Hat OpenShift AI documentation or the associated training notebooks.