---
license: other
base_model: stabilityai/stable-diffusion-3.5-medium
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
- redhat
- corporate-branding
- fine-tuned
library_name: diffusers
pipeline_tag: text-to-image
---
# RedHat Dog SD3 - Fine-tuned Stable Diffusion 3.5 Model
## Model Description
This is a fine-tuned version of [Stable Diffusion 3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium), trained with the DreamBooth technique to generate images of a specific Red Hat branded dog character ("rhteddy").
## Model Details
- **Base Model**: stabilityai/stable-diffusion-3.5-medium
- **Fine-tuning Method**: DreamBooth
- **Training Data**: 5-10 images of Red Hat dog character
- **Training Steps**: 800 steps
- **Resolution**: 512x512 pixels
- **Hardware**: NVIDIA L40S GPU (48GB memory)
## Intended Use
This model is designed for:
- Generating images of the Red Hat dog character in various contexts
- Educational demonstrations of DreamBooth fine-tuning
- Corporate branding and marketing content creation
- Research into personalized diffusion models
## Example
```python
import torch
from diffusers import DiffusionPipeline

# Load the fine-tuned pipeline in bfloat16 to reduce memory use
pipeline = DiffusionPipeline.from_pretrained(
    "cfchase/redhat-dog-sd3",
    torch_dtype=torch.bfloat16,
)

# Use the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline.to(device)

# Generate an image using the "rhteddy dog" trigger phrase
image = pipeline("photo of a rhteddy dog in a park").images[0]
image.save("redhat_dog_park.png")
```
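For repeatable outputs, or for GPUs with limited memory, the call above could be wrapped roughly as follows. This is a sketch, not part of the original training or inference setup: the `call_kwargs` and `generate` helpers, the seed, and the step and guidance values are illustrative assumptions. `enable_model_cpu_offload()` trades speed for lower peak VRAM and needs the `accelerate` package installed.

```python
def call_kwargs(prompt, steps=40, guidance_scale=4.5):
    """Collect keyword arguments for the pipeline call.

    The default step count and guidance scale are assumed starting
    points, not values documented for this fine-tuned model.
    """
    if steps < 1:
        raise ValueError("steps must be positive")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate(prompt, seed=0, low_vram=False, **overrides):
    """Hypothetical helper: load the pipeline and render one image."""
    # Imported lazily so call_kwargs can be used without a GPU stack.
    import torch
    from diffusers import DiffusionPipeline

    pipeline = DiffusionPipeline.from_pretrained(
        "cfchase/redhat-dog-sd3", torch_dtype=torch.bfloat16
    )
    if low_vram and torch.cuda.is_available():
        # Moves one sub-model at a time to the GPU instead of all at once.
        pipeline.enable_model_cpu_offload()
    else:
        pipeline.to("cuda" if torch.cuda.is_available() else "cpu")

    # A seeded generator makes repeated runs on the same setup reproducible.
    generator = torch.Generator("cpu").manual_seed(seed)
    return pipeline(generator=generator, **call_kwargs(prompt, **overrides)).images[0]
```

Calling `generate("photo of a rhteddy dog in a park", seed=42, low_vram=True)` would then return a PIL image that can be saved with `.save(...)`.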
### Recommended Prompts
The model works best with prompts that include the trigger phrase `rhteddy dog`:
- `"photo of a rhteddy dog"`
- `"rhteddy dog sitting in an office"`
- `"rhteddy dog wearing a Red Hat"`
- `"rhteddy dog in a technology conference"`
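Because the trigger phrase matters, it can help to guard against prompts that omit it. The helper below is a minimal sketch; the name `with_trigger` and the choice to prepend the phrase are assumptions made for illustration, not behavior built into the model.

```python
TRIGGER = "rhteddy dog"


def with_trigger(prompt: str) -> str:
    """Return the prompt unchanged if it contains the trigger phrase,
    otherwise prepend the phrase to it."""
    if TRIGGER in prompt.lower():
        return prompt
    return f"{TRIGGER} {prompt.strip()}"


prompts = [with_trigger(p) for p in ("photo of a rhteddy dog", "sitting in an office")]
print(prompts)  # ['photo of a rhteddy dog', 'rhteddy dog sitting in an office']
```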
## Training Details
### Training Configuration
- **Instance Prompt**: "photo of a rhteddy dog"
- **Class Prompt**: "a photo of dog"
- **Learning Rate**: 5e-6
- **Batch Size**: 1
- **Gradient Accumulation Steps**: 2
- **Optimizer**: 8-bit Adam
- **Scheduler**: Constant
- **Prior Preservation**: Enabled with 200 class images
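To put these settings in perspective, the arithmetic below estimates how often each image is seen during training. It is a rough sketch under two assumptions that this card does not state: that the 800 training steps count optimizer updates, and that each instance sample is paired with one class image, as in the Diffusers DreamBooth example scripts.

```python
train_batch_size = 1
gradient_accumulation_steps = 2
max_train_steps = 800      # assumed to count optimizer updates
num_class_images = 200

# Samples contributing to each optimizer update.
effective_batch = train_batch_size * gradient_accumulation_steps

# Instance samples drawn over the whole run.
instance_samples = effective_batch * max_train_steps


def passes_per_image(num_images: int) -> float:
    """Average number of times each image is seen during training."""
    return instance_samples / num_images


print(effective_batch)                     # 2
print(instance_samples)                    # 1600
print(passes_per_image(5))                 # 320.0
print(passes_per_image(10))                # 160.0
print(passes_per_image(num_class_images))  # 8.0 (class images, assuming 1:1 pairing)
```

With only 5-10 instance images, each one is revisited a few hundred times, which is consistent with the overfitting risk noted under Limitations and is the reason prior preservation is enabled.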
### Training Environment
- **Platform**: Red Hat OpenShift AI (RHOAI)
- **Framework**: Hugging Face Diffusers
- **Acceleration**: xFormers, gradient checkpointing
## Model Architecture
This model inherits the architecture of Stable Diffusion 3.5 Medium:
- **Transformer**: SD3Transformer2DModel
- **VAE**: AutoencoderKL
- **Text Encoders**:
  - 2x CLIPTextModelWithProjection
  - 1x T5EncoderModel
- **Scheduler**: FlowMatchEulerDiscreteScheduler
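One way to confirm that a loaded pipeline matches the component list above is to compare class names. The sketch below assumes the component attribute names used by the Diffusers `StableDiffusion3Pipeline` (`text_encoder`, `text_encoder_2`, `text_encoder_3`, and so on); the helper name `mismatched_components` is hypothetical.

```python
EXPECTED_COMPONENTS = {
    "transformer": "SD3Transformer2DModel",
    "vae": "AutoencoderKL",
    "text_encoder": "CLIPTextModelWithProjection",
    "text_encoder_2": "CLIPTextModelWithProjection",
    "text_encoder_3": "T5EncoderModel",
    "scheduler": "FlowMatchEulerDiscreteScheduler",
}


def mismatched_components(components: dict) -> list:
    """Return a description of every component that is missing or whose
    class name differs from the expected one."""
    problems = []
    for name, expected in EXPECTED_COMPONENTS.items():
        # A missing component shows up as NoneType.
        actual = type(components.get(name)).__name__
        if actual != expected:
            problems.append(f"{name}: expected {expected}, found {actual}")
    return problems
```

With a loaded pipeline, `mismatched_components(pipeline.components)` should return an empty list if the architecture matches.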
## Limitations and Bias
- The model is specifically trained on Red Hat branded imagery and may not generalize well to other contexts
- Training data was limited to a small dataset, which may result in overfitting
- The model inherits any biases present in the base Stable Diffusion 3.5 model
- Performance is optimized for the specific "rhteddy dog" concept and may struggle with significant variations
## Training Data
The training data consists of approximately 5-10 high-quality images of the Red Hat dog character, featuring:
- Various poses and angles
- Consistent visual style and branding
- Professional photography quality
- Clear subject focus
## Technical Specifications
- **Model Size**: ~47GB (full precision weights)
- **Inference Requirements**:
  - GPU with 8GB+ VRAM recommended
  - CUDA-compatible device
  - Python 3.8+
  - PyTorch 2.0+
  - Diffusers library
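The software requirements above can be checked before loading the model. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and it checks package versions only, not GPU memory.

```python
import re
import sys
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Parse the leading numeric parts of a version string,
    e.g. '2.1.0+cu121' becomes (2, 1, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at non-numeric segments such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment() -> list:
    """Return a list of unmet requirements (empty if all are met)."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    try:
        if version_tuple(metadata.version("torch")) < (2, 0):
            problems.append("PyTorch 2.0+ is required")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    try:
        metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        problems.append("Diffusers is not installed")
    return problems
```

Running `print(check_environment())` lists anything that still needs to be installed or upgraded.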
## License
This model is based on Stable Diffusion 3.5 Medium and is subject to the same licensing terms. Please refer to the [original model license](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) for details.
## Contact
For questions about this model or the training process, please refer to the [Red Hat OpenShift AI documentation](https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed) or the associated training notebooks.