# SDXL TensorRT-RTX BF16 Ampere
TensorRT-RTX optimized engines for Stable Diffusion XL on NVIDIA Ampere architecture (RTX 30 series, A100, etc.) with BF16 precision.
## Model Details
- Base Model: stabilityai/stable-diffusion-xl-base-1.0
- Architecture: AMPERE (Compute Capability 8.6)
- Precision: BF16 (16-bit brain floating point)
- TensorRT-RTX Version: 1.0.0.21
- Image Resolution: 1024x1024
- Batch Size: 1 (static)
## Engine Files
This repository contains 4 TensorRT engine files:

- `clip.trt1.0.0.21.plan` - CLIP text encoder
- `clip2.trt1.0.0.21.plan` - CLIP text encoder 2
- `unetxl.trt1.0.0.21.plan` - U-Net XL diffusion model
- `vae.trt1.0.0.21.plan` - VAE decoder

Total size: 6.5 GB
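The `.plan` files are serialized engines and must be deserialized by the matching runtime before inference. A minimal sketch of that step, assuming a TensorRT-style Python runtime is importable as `tensorrt` (the exact module shipped with TensorRT-RTX may differ) and an `engines/` directory holding the files above:

```python
# Sketch: deserialize one of the .plan files with a TensorRT-style Python runtime.
# The module name and local path are assumptions for illustration.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# Read the serialized engine and turn it into an executable CUDA engine
with open("engines/unetxl.trt1.0.0.21.plan", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
print("Loaded engine with", engine.num_io_tensors, "I/O tensors")
```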
## Hardware Requirements
- NVIDIA RTX 30 series (RTX 3060, 3070, 3080, 3090) or A100
- Compute Capability 8.6
- Minimum 12GB VRAM recommended
- TensorRT-RTX 1.0.0.21 runtime
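A quick check that the local GPU matches the build target can save a failed engine load. The sketch below uses PyTorch's CUDA queries purely for illustration; any CUDA introspection tool (pynvml, nvidia-smi) works as well:

```python
# Sketch: verify compute capability and VRAM against the build target (Ampere, SM 8.6).
import torch

props = torch.cuda.get_device_properties(0)
capability = (props.major, props.minor)
vram_gb = props.total_memory / 1024**3

print(f"GPU: {props.name}, compute capability {props.major}.{props.minor}, {vram_gb:.1f} GB VRAM")
if capability != (8, 6):
    print("Warning: these engines were built for compute capability 8.6; other GPUs may not load them")
if vram_gb < 12:
    print("Warning: at least 12 GB of VRAM is recommended")
```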
## Usage

```python
# Example usage with the TensorRT-RTX backend
from imageai_server.shared.tensorrt_rtx_backend import TensorRTRTXBackend

backend = TensorRTRTXBackend()
backend.load_engines("path/to/engines")
image = backend.generate("A beautiful sunset over mountains")
```
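If the engines are not already on disk, one way to fetch them is with `huggingface_hub`; passing the snapshot directory straight to `load_engines` is an assumption about the backend:

```python
# Sketch: download the engine files from the Hub and hand the local path to the backend.
from huggingface_hub import snapshot_download

engine_dir = snapshot_download(repo_id="imgailab/sdxl-bf16-ampere")
# backend.load_engines(engine_dir)  # reuse the backend from the example above
```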
## Performance
- Inference Speed: ~2-3 seconds per image (RTX 3090)
- Memory Usage: ~6-8GB VRAM
- Optimizations: Static shapes, BF16 precision, Ampere-specific kernels
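These figures depend on clocks, driver version, and thermal state. A rough way to reproduce them, reusing the `backend` object from the usage example above (so the same assumptions apply):

```python
# Sketch: rough end-to-end latency measurement for image generation.
import time

backend.generate("warm-up prompt")  # first call absorbs one-time setup cost

n = 5
start = time.perf_counter()
for _ in range(n):
    backend.generate("A beautiful sunset over mountains")
elapsed = (time.perf_counter() - start) / n
print(f"Average latency: {elapsed:.2f} s per image")
```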
## License
This model is released under the same license as the base SDXL model (OpenRAIL++).
## Built With
- TensorRT-RTX 1.0.0.21
- NVIDIA Diffusion Demo
- Built on NVIDIA GeForce RTX 3090 (Ampere 8.6)