This model is part of the research work described in "FeatureFusion: Merging Diffusion Models Through Representation Correlations" by Murdock Aubry and James Bona-Landry.
Model Description
Overview
This model is a food specialist: a Stable Diffusion 1.4 model whose UNet was fine-tuned on food-related data.
Model Details
Base Model: CompVis/stable-diffusion-v1-4
Type: Specialist
Specialization: Food
Training Data: Food shard of Pick-a-Pic v1
Model Architecture: UNet-based diffusion model
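As a rough illustration of how a checkpoint like this might be loaded, the sketch below uses the `diffusers` library and assumes the weights are published in the standard Stable Diffusion pipeline layout. The repository id is a hypothetical placeholder, since this card does not state one, and the helper names `load_food_specialist` and `generate` are invented for this example.

```python
# Hypothetical placeholder: this card does not state the checkpoint's repository id.
MODEL_ID = "<this-model-repo-id>"


def load_food_specialist(model_id: str = MODEL_ID):
    """Load the checkpoint as a Stable Diffusion pipeline (assumed layout)."""
    # Imported lazily so this module can be imported without diffusers installed.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    # Optional memory savers at inference time; the card lists the same
    # two options among its training-time memory optimizations.
    pipe.enable_attention_slicing()
    pipe.vae.enable_slicing()
    return pipe


def generate(pipe, prompt: str, steps: int = 30):
    """Generate a single image; food-related prompts match the specialization."""
    return pipe(prompt, num_inference_steps=steps).images[0]
```

With a real repository id filled in, `generate(load_food_specialist("<repo-id>"), "a bowl of ramen with a soft-boiled egg")` would return a PIL image. The step count of 30 is an arbitrary default chosen for the example, not a recommended setting from the authors.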
Limitations
The model inherits the limitations of the base Stable Diffusion v1.4 model.
Performance is best when prompts relate to the model's specialization (food).
The model may produce unexpected results for concepts outside its training distribution.
Training
Training Procedure
Training Data: Pick-a-Pic v1
Training Method: Fine-tuning of the UNet only; the text encoder and VAE are kept frozen
Hyperparameters:
Optimizer: AdamW
Learning rate: 1e-6
Schedule: Cosine with warmup
Training duration: 5 epochs over 1,000 data samples
Memory optimization: Gradient accumulation (4 steps), attention slicing, VAE slicing, gradient checkpointing
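The setup above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' training script: the batch size and warmup length are not given in this card, so `BATCH_SIZE = 1` and a 10% warmup are placeholders, and the function names are invented for the example. The learning-rate curve follows the common linear-warmup-then-cosine-decay shape, which may differ in detail from the schedule actually used.

```python
import math

PEAK_LR = 1e-6       # from the card
EPOCHS = 5           # from the card
NUM_SAMPLES = 1000   # from the card
GRAD_ACCUM = 4       # from the card
BATCH_SIZE = 1       # ASSUMPTION: not stated in the card


def freeze_for_unet_finetuning(pipe):
    """Freeze the text encoder and VAE and leave only the UNet trainable.

    Expects a diffusers-style pipeline whose components are torch modules.
    """
    pipe.text_encoder.requires_grad_(False)
    pipe.vae.requires_grad_(False)
    pipe.unet.requires_grad_(True)
    pipe.unet.enable_gradient_checkpointing()  # trades compute for memory
    # With PyTorch, the optimizer would then be built over the UNet only,
    # e.g. torch.optim.AdamW(pipe.unet.parameters(), lr=PEAK_LR).
    return pipe


def optimizer_steps(num_samples=NUM_SAMPLES, epochs=EPOCHS,
                    batch_size=BATCH_SIZE, grad_accum=GRAD_ACCUM):
    """Count optimizer updates when gradients are accumulated over micro-batches."""
    micro_batches = math.ceil(num_samples / batch_size)
    return epochs * math.ceil(micro_batches / grad_accum)


def lr_at(step, total_steps, warmup_steps, peak_lr=PEAK_LR):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


TOTAL_STEPS = optimizer_steps()
WARMUP_STEPS = TOTAL_STEPS // 10  # ASSUMPTION: 10% warmup, not stated in the card
```

Under these assumptions, 1,000 samples per epoch with 4-step accumulation give 250 optimizer updates per epoch and 1,250 in total. A larger batch size would reduce that count proportionally.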
Citation
If you use this model in your research, please cite:
@article{aubry2024featurefusion,
title={FeatureFusion: Merging Diffusion Models Through Representation Correlations},
author={Aubry, Murdock and Bona-Landry, James},
journal={},
year={2025}
}
---
license: mit
language:
- en
base_model:
- CompVis/stable-diffusion-v1-4
pipeline_tag: text-to-image
---