metadata

license: other
license_name: flux-1-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
language:
  - en
library_name: diffusers
pipeline_tag: text-to-image
tags:
  - Text-to-Image
  - ControlNet
  - Diffusers
  - Flux.1-dev
  - image-generation
  - Stable Diffusion
base_model: black-forest-labs/FLUX.1-dev

FLUX.1-dev-ControlNet-Union-Pro-2.0

This repository contains a unified ControlNet for the FLUX.1-dev model, released by Shakker Labs.

Key Notes

Compared with Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro, this release:

Removes the mode embedding and has a smaller model size (6.6 GB -> 4.0 GB).
Improves canny and pose, with better control and aesthetics.
Adds support for soft edge and removes support for tile.

Model Cards

This ControlNet consists of 6 double blocks and 0 single blocks, the same as Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro. The mode embedding is removed.
We trained the model from scratch for 300k steps on a dataset of 20M high-quality general and human images. Training used 512x512 resolution in BFloat16, with batch size = 128 and learning rate = 2e-5. The guidance was uniformly sampled from [1, 7], and the text drop ratio was set to 0.20.
This model supports multiple control modes: canny, soft edge, depth, pose, and gray.
This model can be used jointly with other ControlNets.
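
The training numbers above can be made concrete with a small sketch. It is not the authors' training code: the `TrainConfig` class and both helper functions are hypothetical names, and it assumes that "text drop" means replacing the caption with an empty string, which the card does not specify. It shows the total number of samples processed and how a per-sample guidance value and caption drop could be drawn.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    # Values quoted from the model card.
    steps: int = 300_000
    batch_size: int = 128
    dataset_size: int = 20_000_000
    learning_rate: float = 2e-5
    resolution: int = 512
    guidance_min: float = 1.0
    guidance_max: float = 7.0
    text_drop_ratio: float = 0.20


def samples_seen(cfg: TrainConfig) -> int:
    """Total training samples processed: steps x batch size."""
    return cfg.steps * cfg.batch_size


def sample_conditioning(caption: str, cfg: TrainConfig, rng: random.Random) -> Tuple[float, str]:
    """Draw a guidance value and possibly drop the caption for one sample.

    Assumption: dropping text means substituting an empty caption.
    """
    guidance = rng.uniform(cfg.guidance_min, cfg.guidance_max)
    if rng.random() < cfg.text_drop_ratio:
        caption = ""
    return guidance, caption


cfg = TrainConfig()
rng = random.Random(0)

total = samples_seen(cfg)           # 300_000 * 128 = 38_400_000
passes = total / cfg.dataset_size   # 38.4M / 20M = 1.92

# One illustrative batch of (guidance, caption) pairs.
batch = [sample_conditioning("a photo of a cat", cfg, rng) for _ in range(cfg.batch_size)]
dropped = sum(1 for _, caption in batch if caption == "")
print(total, passes, dropped)
```

Under these figures the model processes 38.4M samples, roughly 1.92 passes over the 20M-image dataset if every image is sampled equally often.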

Inference

import torch
from diffusers.utils import load_image
from diffusers import FluxControlNetPipeline, FluxControlNetModel

base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model_union = 'Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0'

# Load the ControlNet and the FLUX.1-dev base pipeline in bfloat16.
controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union, torch_dtype=torch.bfloat16)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Placeholder inputs: replace the path with your own preprocessed condition image
# (canny, soft edge, depth, pose, or gray) and the string with your own prompt.
control_image = load_image("path/to/control_image.png")
prompt = "your text prompt"

# Generate at the resolution of the control image.
width, height = control_image.size

image = pipe(
    prompt,
    control_image=control_image,
    width=width,
    height=height,
    controlnet_conditioning_scale=0.7,
    control_guidance_end=0.8,
    num_inference_steps=24,
    guidance_scale=3.5,
    generator=torch.manual_seed(42),
).images[0]
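
Because the width and height above are read directly from the control image, they can be arbitrary values. The following is a minimal sketch, assuming the pipeline prefers dimensions divisible by 16; that figure is an assumption about the diffusers FLUX pipelines and is not stated in this card. The helpers `snap_to_multiple` and `snap_size` are hypothetical names.

```python
from typing import Tuple


def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round `value` down to the nearest positive multiple of `multiple`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    if value < multiple:
        raise ValueError(f"value {value} is smaller than {multiple}")
    return (value // multiple) * multiple


def snap_size(width: int, height: int, multiple: int = 16) -> Tuple[int, int]:
    """Snap both dimensions; 16 is an assumed requirement, adjust if needed."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


print(snap_size(1023, 770))  # (1008, 768)
```

The snapped values could then be passed as `width` and `height` in place of the raw image size.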

Recommended Parameters

You can adjust controlnet_conditioning_scale and control_guidance_end for stronger control and better detail preservation.

Canny: use cv2.Canny, controlnet_conditioning_scale=0.7, control_guidance_end=0.8.
Soft Edge: use AnylineDetector, controlnet_conditioning_scale=0.7, control_guidance_end=0.8.
Depth: use depth-anything, controlnet_conditioning_scale=0.8, control_guidance_end=0.8.
Pose: use DWPose, controlnet_conditioning_scale=0.9, control_guidance_end=0.65.
Gray: use cv2.cvtColor, controlnet_conditioning_scale=0.9, control_guidance_end=0.8.
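
The recommendations above can be kept in a small lookup table so that the right values are passed for each control mode. This is only a convenience sketch: `ControlSettings`, `RECOMMENDED`, and `pipeline_kwargs` are hypothetical names, and the numbers are copied from the list above.

```python
from typing import Dict, NamedTuple


class ControlSettings(NamedTuple):
    preprocessor: str
    controlnet_conditioning_scale: float
    control_guidance_end: float


# Values copied from the recommendations in this card.
RECOMMENDED: Dict[str, ControlSettings] = {
    "canny": ControlSettings("cv2.Canny", 0.7, 0.8),
    "soft_edge": ControlSettings("AnylineDetector", 0.7, 0.8),
    "depth": ControlSettings("depth-anything", 0.8, 0.8),
    "pose": ControlSettings("DWPose", 0.9, 0.65),
    "gray": ControlSettings("cv2.cvtColor", 0.9, 0.8),
}


def pipeline_kwargs(mode: str) -> Dict[str, float]:
    """Return the two control arguments recommended for `mode`."""
    # Accept spellings such as "Soft Edge" or "soft-edge".
    key = mode.strip().lower().replace(" ", "_").replace("-", "_")
    if key not in RECOMMENDED:
        raise KeyError(f"unknown control mode {mode!r}; expected one of {sorted(RECOMMENDED)}")
    settings = RECOMMENDED[key]
    return {
        "controlnet_conditioning_scale": settings.controlnet_conditioning_scale,
        "control_guidance_end": settings.control_guidance_end,
    }


print(pipeline_kwargs("Soft Edge"))
```

The returned dictionary can be unpacked into the pipeline call, for example `pipe(prompt, control_image=control_image, **pipeline_kwargs("pose"))`, in place of the hard-coded values.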

Acknowledgements

This model was developed by Shakker Labs. The original idea was inspired by xinsir/controlnet-union-sdxl-1.0. All rights reserved.