Stable Diffusion 3.5 Large ControlNet TensorRT

Introduction

This repository hosts the TensorRT-optimized version of Stable Diffusion 3.5 Large ControlNets, developed in collaboration between Stability AI and NVIDIA. This implementation leverages NVIDIA's TensorRT deep learning inference library to deliver significant performance improvements while maintaining the exceptional image quality of the original model.

Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model with improved image quality, typography, complex prompt understanding, and resource efficiency. The TensorRT optimization makes these capabilities accessible for production deployment and real-time applications.

The following control types are available:

  • Canny - Use a Canny edge map to guide the structure of the generated image. This is especially useful for illustrations, but works with all styles.

  • Depth - Use a depth map, generated by DepthFM, to guide generation. Example use cases include generating architectural renderings and texturing 3D assets.

  • Blur - Use a blurred input to perform extremely high-fidelity upscaling. A common use case is to tile an input image, apply the ControlNet to each tile, and merge the tiles to produce a higher-resolution image.
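
The tiling workflow described for the Blur ControlNet can be sketched as follows. This is a minimal illustration, not the pipeline used by the model authors: the helper name `tile_boxes`, the 1024-pixel tile size, and the 128-pixel overlap are assumptions chosen for the example. The sketch only computes the crop boxes; running the ControlNet on each tile and blending the overlapping regions are left out.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) crop boxes that cover the image.

    Hypothetical helper: tile size and overlap are illustrative defaults.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")

    def starts(length):
        # An axis shorter than one tile needs a single tile starting at 0.
        if length <= tile:
            return [0]
        stride = tile - overlap
        positions = list(range(0, length - tile, stride))
        # Align the final tile to the far edge so no pixels are left uncovered.
        positions.append(length - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(2048, 2048)
print(len(boxes), boxes[0], boxes[-1])
```

Overlapping tiles give the merge step a band of shared pixels to blend across, which helps hide seams; the right amount of overlap depends on the image and is something to tune.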

Model Details

Model Description

This repository holds the ONNX exports of the Depth, Canny, and Blur ControlNet models in BF16 precision.

Performance using TensorRT 10.13

Depth ControlNet: Timings for 40 steps at 1024x1024

| Accelerator | Precision | VAE Encoder | CLIP-G | CLIP-L | T5 | MMDiT x 40 | VAE Decoder | Total |
|---|---|---|---|---|---|---|---|---|
| H100 | BF16 | 74.97 ms | 11.87 ms | 4.90 ms | 8.82 ms | 18839.01 ms | 117.38 ms | 19097.19 ms |

Canny ControlNet: Timings for 60 steps at 1024x1024

| Accelerator | Precision | VAE Encoder | CLIP-G | CLIP-L | T5 | MMDiT x 60 | VAE Decoder | Total |
|---|---|---|---|---|---|---|---|---|
| H100 | BF16 | 78.50 ms | 12.29 ms | 5.08 ms | 8.65 ms | 28057.08 ms | 106.49 ms | 28306.20 ms |

Blur ControlNet: Timings for 60 steps at 1024x1024

| Accelerator | Precision | VAE Encoder | CLIP-G | CLIP-L | T5 | MMDiT x 60 | VAE Decoder | Total |
|---|---|---|---|---|---|---|---|---|
| H100 | BF16 | 74.48 ms | 11.71 ms | 4.86 ms | 8.80 ms | 28604.26 ms | 113.24 ms | 28859.06 ms |
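
To put the timings above in perspective, the short sketch below derives the per-step cost of the MMDiT column and its share of the end-to-end total. The numbers are copied from the tables; the dictionary layout and the `summarize` helper are illustrative assumptions, not part of the demo code.

```python
# Timings in milliseconds, copied from the tables above (H100, BF16, 1024x1024).
timings = {
    "depth": {"steps": 40, "mmdit_ms": 18839.01, "total_ms": 19097.19},
    "canny": {"steps": 60, "mmdit_ms": 28057.08, "total_ms": 28306.20},
    "blur": {"steps": 60, "mmdit_ms": 28604.26, "total_ms": 28859.06},
}


def summarize(row):
    """Return (milliseconds per denoising step, fraction of total time)."""
    per_step = row["mmdit_ms"] / row["steps"]
    share = row["mmdit_ms"] / row["total_ms"]
    return per_step, share


for name, row in timings.items():
    per_step, share = summarize(row)
    print(f"{name}: {per_step:.1f} ms per step, {share:.1%} of total")
```

In all three cases the denoising loop accounts for roughly 99% of the total at about 470 ms per step, so the number of denoising steps is the main lever on end-to-end latency.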

Usage Example

  1. Follow the setup instructions for launching a TensorRT NGC container.

    git clone https://github.com/NVIDIA/TensorRT.git
    cd TensorRT
    git checkout release/sd35
    docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:25.01-py3 /bin/bash

  2. Install libraries and requirements.

    cd demo/Diffusion
    python3 -m pip install --upgrade pip
    pip3 install -r requirements.txt
    python3 -m pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt-cu12

  3. Generate a Hugging Face user access token. To download the Stable Diffusion 3.5 model checkpoints, request access on the Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Depth ControlNet, Stable Diffusion 3.5 Large Canny ControlNet, and Stable Diffusion 3.5 Large Blur ControlNet pages. Then obtain a read access token for the Hugging Face Hub and export it as shown below. See instructions.

    export HF_TOKEN=<your access token>

  4. Perform TensorRT-optimized inference:
  • Stable Diffusion 3.5 Large Depth ControlNet in BF16 precision

    python3 demo_controlnet_sd35.py \
      "a photo of a man" \
      --version=3.5-large \
      --bf16 \
      --controlnet-type depth \
      --download-onnx-models \
      --denoising-steps=40 \
      --guidance-scale 4.5 \
      --build-static-batch \
      --use-cuda-graph \
      --hf-token=$HF_TOKEN
    
  • Stable Diffusion 3.5 Large Canny ControlNet in BF16 precision

    python3 demo_controlnet_sd35.py \
      "A Night time photo taken by Leica M11, portrait of a Japanese woman in a kimono, looking at the camera, Cherry blossoms" \
      --version=3.5-large \
      --bf16 \
      --controlnet-type canny \
      --download-onnx-models \
      --denoising-steps=60 \
      --guidance-scale 3.5 \
      --build-static-batch \
      --use-cuda-graph \
      --hf-token=$HF_TOKEN
    
  • Stable Diffusion 3.5 Large Blur ControlNet in BF16 precision

    python3 demo_controlnet_sd35.py \
      "generated ai art, a tiny, lost rubber ducky in an action shot close-up, surfing the humongous waves, inside the tube, in the style of Kelly Slater" \
      --version=3.5-large \
      --bf16 \
      --controlnet-type blur \
      --download-onnx-models \
      --denoising-steps=60 \
      --guidance-scale 3.5 \
      --build-static-batch \
      --use-cuda-graph \
      --hf-token=$HF_TOKEN
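
The three invocations above differ only in the prompt, the control type, the number of denoising steps, and the guidance scale. The sketch below shows one way to assemble them from Python, using only the flags that appear in the commands above. The `PRESETS` table and the `build_command` helper are hypothetical names introduced for illustration; they are not part of the TensorRT demo.

```python
import os
import shlex

# Per-type settings copied from the three commands above.
PRESETS = {
    "depth": {"steps": 40, "guidance": 4.5},
    "canny": {"steps": 60, "guidance": 3.5},
    "blur": {"steps": 60, "guidance": 3.5},
}


def build_command(prompt, controlnet_type, hf_token):
    """Assemble the demo_controlnet_sd35.py argument list (hypothetical helper)."""
    if controlnet_type not in PRESETS:
        raise ValueError(f"unknown control type: {controlnet_type!r}")
    if not hf_token:
        raise ValueError("HF_TOKEN is empty; export it before running the demo")
    preset = PRESETS[controlnet_type]
    return [
        "python3", "demo_controlnet_sd35.py", prompt,
        "--version=3.5-large",
        "--bf16",
        "--controlnet-type", controlnet_type,
        "--download-onnx-models",
        f"--denoising-steps={preset['steps']}",
        "--guidance-scale", str(preset["guidance"]),
        "--build-static-batch",
        "--use-cuda-graph",
        f"--hf-token={hf_token}",
    ]


if __name__ == "__main__":
    token = os.environ.get("HF_TOKEN", "")
    if token:
        cmd = build_command("a photo of a man", "depth", token)
        # Print with the token masked; subprocess.run(cmd, check=True) would execute it.
        print(shlex.join(cmd[:-1] + ["--hf-token=***"]))
    else:
        print("HF_TOKEN is not set")
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain commas or quotes, and checking the token up front surfaces a missing `HF_TOKEN` before any model download starts.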
    