library_name: diffusers
pipeline_tag: text-to-image
base_model: black-forest-labs/FLUX.1-dev
widget:
  - text: >-
      An impressionist painting portrays a vast landscape with gently rolling
      hills under a radiant sky. Clusters of autumn trees dot the scene, rendered
      with loose, expressive brushstrokes and a palette of warm oranges, deep
      greens, and soft blues, creating a sense of tranquil, natural beauty
    output:
      url: images/example_jl6x0209w.png
---

# FLUX.1-dev Impressionism fine-tuning with LoRA

This is a LoRA fine-tuning of the FLUX.1 model, trained on a curated dataset of Impressionist paintings from WikiArt.

## Training Process & Results

### Training Environment
- GPU: NVIDIA A100
- Training Duration: ~1 hour for 1000 steps
- Training Notebook: [Google Colab Notebook](https://colab.research.google.com/drive/1G9k6iwSGKXmA32ok4zOPijFUFwBAZ9aB?usp=sharing)
- Training Framework: [AI-Toolkit](https://github.com/ostris/ai-toolkit)

## Training Progress Visualization

### Training Progress Grid
![Training progress grid](images/training_progress_grid.jpg)
*4x6 grid showing model progression across different prompts (rows) at various training steps (columns: 0, 200, 400, 600, 800, 1000)*

### Step-by-Step Evolution
![Step-by-step evolution](images/training_evolution.gif)
*Evolution of the model's output for the prompt: "An impressionist painting portrays a vast landscape with gently rolling hills under a radiant sky. Clusters of autumn trees dot the scene, rendered with loose, expressive brushstrokes and a palette of warm oranges, deep greens, and soft blues, creating a sense of tranquil, natural beauty" (steps 0-1000, sampled every 100 steps)*

### Base vs Fine-tuned
![Base vs fine-tuned comparison](images/comparison.jpg)
*Left side is the base model and right side is this fine-tuned model*

### Current Results & Future Improvements

The most notable improvements are observed in landscape generation, which can be attributed to:
- Strong representation of landscapes (30%) in the training dataset
- Inherent structural similarities in impressionist landscape paintings
- Clear patterns in color usage and brushstroke techniques

Future improvements will focus on:
- Experimenting with different LoRA configurations and ranks
- Fine-tuning hyperparameters for better convergence
- Improving caption quality and specificity (the current captions may be too complex for the model to capture specific features)
- Testing the 'style' setting for the 'content_or_style' parameter in the training config, which is currently set to 'balanced'
- Extending training duration beyond 1000 steps
- Developing custom training scripts for more granular control

While the current implementation uses [AI-Toolkit](https://github.com/ostris/ai-toolkit), future iterations will involve developing custom training scripts to gain deeper insights into model configuration and behavior.

## Dataset
The model was trained on the [WikiArt Impressionism Curated Dataset](https://huggingface.co/datasets/dolphinium/wikiart-impressionism-curated), which contains 1,000 high-quality Impressionist paintings with the following distribution:

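The dataset can be pulled straight from the Hub with the `datasets` library. A minimal sketch (the split name `train` is an assumption; check the dataset card for the actual splits):

```python
from datasets import load_dataset

# Download the curated Impressionism subset from the Hugging Face Hub.
ds = load_dataset("dolphinium/wikiart-impressionism-curated", split="train")
print(ds)  # inspect features and number of rows
```
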
## Model Details
- Base Model: [FLUX.1](https://huggingface.co/black-forest-labs/FLUX.1-dev)
- LoRA Rank: 16
- Training Steps: 1000
- Resolution: 512, 768, and 1024 px (multi-resolution)

You can find the detailed training configuration in [config.yaml](config.yaml).
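For orientation, the relevant fields of an AI-Toolkit training config look roughly like the sketch below. The job name and dataset path are placeholders, and the field layout follows AI-Toolkit's published example configs rather than this repo's exact file; [config.yaml](config.yaml) is authoritative.

```yaml
job: extension
config:
  name: "flux_wikiart_impressionism"      # placeholder name
  process:
    - type: "sd_trainer"
      network:
        type: "lora"
        linear: 16                        # LoRA rank used here
        linear_alpha: 16
      datasets:
        - folder_path: "/path/to/wikiart-impressionism"  # placeholder path
          resolution: [512, 768, 1024]    # multi-resolution training
      train:
        steps: 1000
        content_or_style: "balanced"      # testing 'style' is planned
      model:
        name_or_path: "black-forest-labs/FLUX.1-dev"
        is_flux: true
```
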
## Usage

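A minimal sketch of running this LoRA with diffusers' `FluxPipeline` (requires `peft`; the sampler settings are illustrative suggestions, not tuned values from training):

```python
import torch
from diffusers import FluxPipeline

# Load the base FLUX.1-dev model and attach the Impressionism LoRA.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights("dolphinium/FLUX.1-dev-wikiart-impressionism-v2")
pipe.to("cuda")

prompt = "an impressionist style landscape with rolling hills and autumn trees"
# Steps and guidance are reasonable defaults for FLUX.1-dev; adjust to taste.
image = pipe(prompt, num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("impressionist_landscape.png")
```

Loaded this way, the full pipeline needs a large GPU; for smaller cards, see the quantized route below.
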
To run the model with 4-bit quantization, check out this [Google Colab Notebook](https://colab.research.google.com/drive/1dnCeNGHQSuWACrG95rH4TXPgXwNNdTh-?usp=sharing).

On Google Colab, the cheapest way to run it is a T4 instance with high RAM.

Thanks also to the provider of the original notebook for running FLUX with 4-bit quantization: [Original Colab Notebook](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/Flux/Run_Flux_on_an_8GB_machine.ipynb).

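As a rough sketch of what 4-bit loading can look like with diffusers' bitsandbytes integration (assumes `diffusers>=0.31` and `bitsandbytes` are installed; the linked notebooks walk through a complete, possibly differently structured, setup):

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# Quantize only the FLUX transformer to 4-bit NF4; it dominates memory use.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # float16 suits T4-class GPUs
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.float16,
)
# Loading a LoRA on top of a quantized transformer relies on peft support.
pipe.load_lora_weights("dolphinium/FLUX.1-dev-wikiart-impressionism-v2")
pipe.enable_model_cpu_offload()  # stream submodules to the GPU as needed
```
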
## License
This model inherits the licenses of the base [FLUX.1 model](https://huggingface.co/black-forest-labs/FLUX.1-dev) and the [WikiArt](https://huggingface.co/datasets/huggan/wikiart) dataset.