Model Card for Model ID

Model Details

Model Description

This is the model card for a 🧨 diffusers model that has been automatically generated and pushed to the Hub. It describes a UNet2D diffusion model trained for image generation at a resolution of 128x128. The model architecture and training parameters are detailed below.

  • Developed by: Milu Catalin
  • Model type: UNet2D diffusion model
  • Language(s) (NLP): N/A (image generation)
  • Training hardware: AMD MI300X
  • Training epochs: 100
  • Training dataset: huggan/flowers-102-categories
  • Image size: 128x128

Uses

Direct Use

This model is intended for direct use in generating 128x128 images, particularly for datasets similar to the "huggan/flowers-102-categories" dataset. It can be used for tasks such as image synthesis, data augmentation, or creative image generation.
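
As a minimal sketch of direct use, a trained checkpoint could be sampled as shown below. The pipeline class and the repository id are assumptions inferred from the "ddim-flowers-128" output directory, not confirmed details of this model.

from diffusers import DDIMPipeline

# "your-username/ddim-flowers-128" is a hypothetical repo id; replace it with
# the actual Hub id of this model.
pipeline = DDIMPipeline.from_pretrained("your-username/ddim-flowers-128")
pipeline.to("cuda")

# Generate one 128x128 image and save it to disk.
image = pipeline(batch_size=1, num_inference_steps=50).images[0]
image.save("flower_sample.png")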

Downstream Use

The model can be fine-tuned for specific image generation tasks, integrated into larger generative AI pipelines, or used as a component in applications requiring image synthesis capabilities.
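
As an illustration of downstream fine-tuning, the sketch below shows a single DDPM-style training step on new images. The scheduler, optimizer, hyperparameters, and repository id are assumptions chosen for illustration, not the original training recipe.

import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler, UNet2DModel

# Hypothetical repo id; replace with the actual Hub id of this model.
model = UNet2DModel.from_pretrained("your-username/ddim-flowers-128")
noise_scheduler = DDPMScheduler(num_train_timesteps=1000)  # assumed scheduler
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def training_step(clean_images):
    # clean_images: (batch, 3, 128, 128) tensor scaled to [-1, 1]
    noise = torch.randn_like(clean_images)
    timesteps = torch.randint(
        0,
        noise_scheduler.config.num_train_timesteps,
        (clean_images.shape[0],),
        device=clean_images.device,
    )
    # Forward diffusion: add noise to the clean images at the sampled timesteps.
    noisy_images = noise_scheduler.add_noise(clean_images, noise, timesteps)
    # The UNet is trained to predict the noise that was added.
    noise_pred = model(noisy_images, timesteps).sample
    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()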

Out-of-Scope Use

This model might not perform optimally for image generation tasks significantly different from the dataset it was trained on. It is not intended for use in applications requiring high-resolution image generation without further fine-tuning or upscaling techniques. Misuse could include generating misleading or harmful content.

Bias, Risks, and Limitations

This model, like all generative models, may inherit biases present in the training data. The generated images may reflect these biases. The model's performance is limited to the resolution and diversity of the training data. Further testing and evaluation are needed to fully understand its limitations and potential biases.

Recommendations

Users should be aware of the potential biases and limitations of this model. It is recommended to evaluate the model's performance on a diverse set of inputs and to use appropriate techniques for mitigating biases. Further research into the training data and model behavior is encouraged.

How to Get Started with the Model

Use the code below to get started with the model. It reproduces the training configuration and the UNet2D architecture used for this model.

from dataclasses import dataclass

import torch
from diffusers import UNet2DModel

@dataclass
class Config:
    # Data and image settings
    image_size: int = 128
    train_batch_size: int = 16
    eval_batch_size: int = 16
    dataset_name: str = "huggan/flowers-102-categories"
    # Optimization settings
    num_epochs: int = 100
    learning_rate: float = 1e-4
    lr_warmup_steps: int = 500
    # Checkpointing and reproducibility
    save_image_epochs: int = 10
    save_model_epochs: int = 30
    output_dir: str = "ddim-flowers-128"
    seed: int = 36
    device: torch.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

config = Config()

# UNet2D with six resolution levels; self-attention is used in two of the
# down blocks and two of the up blocks.
model = UNet2DModel(
    sample_size=config.image_size,  # 128x128 input/output resolution
    in_channels=3,                  # RGB input
    out_channels=3,                 # RGB output (predicted noise)
    layers_per_block=2,             # two ResNet blocks per resolution level
    dropout=0.1,
    block_out_channels=(128, 128, 256, 256, 512, 512),
    down_block_types=(
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "DownBlock2D",
    ),
    up_block_types=(
        "UpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
).to(config.device)

# Print the parameter count and approximate size in memory
# (assumes 4 bytes per parameter, i.e. float32 weights).
total_params = sum(param.numel() for param in model.parameters())
total_size_mb = total_params * 4 / (1024 ** 2)
print("Total Model Parameters:", f"{total_params:,}")
print(f"Total Model Size: {total_size_mb:.2f} MB")