---
license: other
base_model: stabilityai/stable-diffusion-3.5-medium
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
- redhat
- corporate-branding
- fine-tuned
library_name: diffusers
pipeline_tag: text-to-image
---

# RedHat Dog SD3 - Fine-tuned Stable Diffusion 3.5 Model

## Model Description

This is a fine-tuned version of [Stable Diffusion 3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium), trained with the DreamBooth technique to generate images of a specific Red Hat-branded dog character ("rhteddy").

## Model Details

- **Base Model**: stabilityai/stable-diffusion-3.5-medium
- **Fine-tuning Method**: DreamBooth
- **Training Data**: 5-10 images of the Red Hat dog character
- **Training Steps**: 800 steps
- **Resolution**: 512x512 pixels
- **Hardware**: NVIDIA L40S GPU (48GB memory)

## Intended Use

This model is designed for:
- Generating images of the Red Hat dog character in various contexts
- Educational demonstrations of DreamBooth fine-tuning
- Corporate branding and marketing content creation
- Research into personalized diffusion models

## Example

```python
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "cfchase/redhat-dog-sd3",
    torch_dtype=torch.bfloat16
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline.to(device)

# Generate an image
image = pipeline("photo of a rhteddy dog in a park").images[0]
image.save("redhat_dog_park.png")
```

### Recommended Prompts

The model works best with prompts that include the trigger phrase `rhteddy dog`:

- `"photo of a rhteddy dog"`
- `"rhteddy dog sitting in an office"`
- `"rhteddy dog wearing a Red Hat"`
- `"rhteddy dog in a technology conference"`
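The prompts above can be run in one batch. The following is a minimal sketch, not part of the original training or inference code: `slugify` and `generate_all` are hypothetical helper names, and the fixed seed is an assumption added only to make runs repeatable.

```python
import re

# Prompts copied from the list above; each contains the trigger phrase.
PROMPTS = [
    "photo of a rhteddy dog",
    "rhteddy dog sitting in an office",
    "rhteddy dog wearing a Red Hat",
    "rhteddy dog in a technology conference",
]


def slugify(prompt: str) -> str:
    """Turn a prompt into a filesystem-friendly file stem."""
    return re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")


def generate_all(prompts=PROMPTS, seed=0):
    """Generate one image per prompt and save it as <slug>.png."""
    # Heavy imports stay inside the function so slugify() works without a GPU.
    import torch
    from diffusers import DiffusionPipeline

    pipeline = DiffusionPipeline.from_pretrained(
        "cfchase/redhat-dog-sd3",
        torch_dtype=torch.bfloat16,
    )
    pipeline.to("cuda" if torch.cuda.is_available() else "cpu")

    for prompt in prompts:
        # Re-seed per prompt so each image can be regenerated on its own.
        generator = torch.Generator("cpu").manual_seed(seed)
        image = pipeline(prompt, generator=generator).images[0]
        image.save(f"{slugify(prompt)}.png")


# Call generate_all() on a machine that can download and run the model.
```

Deriving file names from the prompt keeps each output traceable to the text that produced it.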

## Training Details

### Training Configuration

- **Instance Prompt**: "photo of a rhteddy dog"
- **Class Prompt**: "a photo of dog"
- **Learning Rate**: 5e-6
- **Batch Size**: 1
- **Gradient Accumulation Steps**: 2
- **Optimizer**: 8-bit Adam
- **Scheduler**: Constant
- **Prior Preservation**: Enabled with 200 class images
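
As a rough worked example of what these settings imply, the arithmetic below combines the batch size and gradient accumulation listed here with the 800 training steps from Model Details. It assumes that a training step means one optimizer update, as in the Diffusers example training scripts, and it counts only instance images; with prior preservation enabled, class images are processed in addition.

```python
# Values taken from this model card.
train_batch_size = 1
gradient_accumulation_steps = 2
max_train_steps = 800

# Instance images contributing to each optimizer update (assumption: one
# "step" is one optimizer update after gradient accumulation).
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Total instance-image samples processed over the whole run.
instance_samples_seen = effective_batch_size * max_train_steps

# The card reports 5-10 training images, so estimate both ends of the range.
for num_images in (5, 10):
    passes = instance_samples_seen / num_images
    print(f"{num_images} images -> about {passes:.0f} passes over each image")
```

Each image is revisited hundreds of times under these assumptions, which is consistent with the overfitting risk noted under Limitations and Bias.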

### Training Environment

- **Platform**: Red Hat OpenShift AI (RHOAI)
- **Framework**: Hugging Face Diffusers
- **Acceleration**: xFormers, gradient checkpointing

## Model Architecture

This model inherits the architecture of Stable Diffusion 3.5 Medium:
- **Transformer**: SD3Transformer2DModel
- **VAE**: AutoencoderKL
- **Text Encoders**: 
  - 2x CLIPTextModelWithProjection
  - 1x T5EncoderModel
- **Scheduler**: FlowMatchEulerDiscreteScheduler
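
One way to check these components after loading is to read the `components` property that Diffusers pipelines expose. This is a minimal sketch: `describe_components` is a hypothetical helper name, not part of the library, and it works on any object that has a `components` mapping.

```python
def describe_components(pipeline) -> dict:
    """Map each loaded pipeline component name to its class name."""
    return {
        name: type(component).__name__
        for name, component in pipeline.components.items()
        if component is not None  # optional components may be unset
    }


# Usage, assuming `pipeline` was loaded as in the Example section:
# for name, cls in describe_components(pipeline).items():
#     print(f"{name}: {cls}")
```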

## Limitations and Bias

- The model is specifically trained on Red Hat branded imagery and may not generalize well to other contexts
- Training data was limited to a small dataset, which may result in overfitting
- The model inherits any biases present in the base Stable Diffusion 3.5 model
- Performance is optimized for the specific "rhteddy dog" concept and may struggle with significant variations

## Training Data

The training data consists of approximately 5-10 high-quality images of the Red Hat dog character, featuring:
- Various poses and angles
- Consistent visual style and branding
- Professional photography quality
- Clear subject focus

## Technical Specifications

- **Model Size**: ~47GB (full precision weights)
- **Inference Requirements**: 
  - GPU with 8GB+ VRAM recommended
  - CUDA-compatible device
  - Python 3.8+
  - PyTorch 2.0+
  - Diffusers library

## License

This model is based on Stable Diffusion 3.5 Medium and is subject to the same licensing terms. Please refer to the [original model license](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) for details.

## Contact

For questions about this model or the training process, please refer to the [Red Hat OpenShift AI documentation](https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed) or the associated training notebooks.