ABDALLALSWAITI committed on
Commit f3a39d4 · verified · 1 Parent(s): 44c3fad

Upload FP8 quantized model

Files changed (1):
  1. README.md +37 -1
README.md CHANGED
@@ -19,7 +19,13 @@ base_model:
 
 # FLUX.1-dev-ControlNet-Union-Pro-2.0 (fp8)
 
- This repository contains an unified ControlNet for FLUX.1-dev model released by [Shakker Labs](https://huggingface.co/Shakker-Labs). We provide an [online demo](https://huggingface.co/spaces/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0).
 
 # Keynotes
 In comparison with [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro),
@@ -131,6 +137,36 @@ You can adjust controlnet_conditioning_scale and control_guidance_end for strong
 - Pose: use [DWPose](https://github.com/IDEA-Research/DWPose/tree/onnx), controlnet_conditioning_scale=0.9, control_guidance_end=0.65.
 - Gray: use cv2.cvtColor, controlnet_conditioning_scale=0.9, control_guidance_end=0.8.
 
 # Resources
 - [InstantX/FLUX.1-dev-IP-Adapter](https://huggingface.co/InstantX/FLUX.1-dev-IP-Adapter)
 - [InstantX/FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny)
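The Gray control bullet above relies on cv2.cvtColor's RGB-to-grayscale conversion; the underlying operation is just a luma weighted sum, sketched here in pure Python (weights per ITU-R BT.601, the same ones OpenCV uses — this is an illustration, not part of the repository):

```python
def rgb_to_gray(pixel):
    """BT.601 luma: the weighting cv2.cvtColor(..., COLOR_RGB2GRAY) applies per pixel."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(rgb_to_gray((255, 255, 255)))  # → 255 (white stays white)
print(rgb_to_gray((255, 0, 0)))      # → 76  (pure red is fairly dark in luma)
```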
 
 
 # FLUX.1-dev-ControlNet-Union-Pro-2.0 (fp8)
 
+ This repository contains a unified ControlNet for the FLUX.1-dev model released by [Shakker Labs](https://huggingface.co/Shakker-Labs). This version has been quantized to FP8 format for optimized inference performance. We provide an [online demo](https://huggingface.co/spaces/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0).
+
+ # FP8 Quantization
+ This model has been quantized from the original BFloat16 format to FP8. The benefits include:
+ - **Reduced Memory Usage**: approximately 50% smaller model size compared to BFloat16/FP16
+ - **Faster Inference**: potential speed improvements, especially on hardware with native FP8 support
+ - **Minimal Quality Loss**: a carefully calibrated quantization process preserves output quality
 
 # Keynotes
 In comparison with [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro),
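The memory claim in the FP8 Quantization section (≈50% smaller than BFloat16) can be sanity-checked with simple arithmetic; the parameter count below is a hypothetical placeholder for illustration, not the actual size of this ControlNet:

```python
def model_bytes(n_params: int, bytes_per_param: int) -> int:
    """Raw weight storage only; ignores per-tensor scales and file metadata."""
    return n_params * bytes_per_param

n = 3_000_000_000              # hypothetical parameter count
bf16_size = model_bytes(n, 2)  # BFloat16 stores 2 bytes per parameter
fp8_size = model_bytes(n, 1)   # FP8 (e4m3fn) stores 1 byte per parameter

print(f"BF16: {bf16_size / 1e9:.1f} GB, FP8: {fp8_size / 1e9:.1f} GB")
print(f"Reduction: {1 - fp8_size / bf16_size:.0%}")  # → 50%
```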
 
 - Pose: use [DWPose](https://github.com/IDEA-Research/DWPose/tree/onnx), controlnet_conditioning_scale=0.9, control_guidance_end=0.65.
 - Gray: use cv2.cvtColor, controlnet_conditioning_scale=0.9, control_guidance_end=0.8.
 
+ # Using the FP8 Model
+ This repository includes the FP8 quantized version of the model. To use it, you'll need a PyTorch build with FP8 support:
+
+ ```python
+ import torch
+ from diffusers.utils import load_image
+ from diffusers import FluxControlNetPipeline, FluxControlNetModel
+
+ base_model = 'black-forest-labs/FLUX.1-dev'
+ controlnet_model_union_fp8 = 'YOUR_USERNAME/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8'
+
+ # Load the ControlNet weights in FP8 (e4m3fn) format
+ controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union_fp8, torch_dtype=torch.float8_e4m3fn)
+ pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
+ pipe.to("cuda")
+
+ # The rest of the code is the same as with the original model
+ ```
+
+ See `fp8_inference_example.py` for a complete example.
+
+ # Pushing Model to Hugging Face Hub
+ To push your FP8 quantized model to the Hugging Face Hub, use the included script:
+
+ ```bash
+ python push_model_to_hub.py --repo_id "YOUR_USERNAME/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8"
+ ```
+
+ You will need the `huggingface_hub` library installed and to be logged in with your Hugging Face credentials.
+
 # Resources
 - [InstantX/FLUX.1-dev-IP-Adapter](https://huggingface.co/InstantX/FLUX.1-dev-IP-Adapter)
 - [InstantX/FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny)
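The diff references a `push_model_to_hub.py` script without showing it. Here is a minimal sketch of what such a script might look like, using the real `huggingface_hub` APIs `HfApi.create_repo` and `HfApi.upload_folder`; the `--repo_id` flag matches the usage shown in the README, but the rest (the `--folder` flag, function names) is an assumption, not the repository's actual script:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI mirroring the README's usage example.
    parser = argparse.ArgumentParser(description="Push an FP8 quantized model folder to the Hugging Face Hub")
    parser.add_argument("--repo_id", required=True,
                        help='e.g. "YOUR_USERNAME/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8"')
    parser.add_argument("--folder", default=".",
                        help="local folder containing the FP8 weights and README")
    return parser

def push(repo_id: str, folder: str) -> None:
    # Imported lazily so the script can be inspected without huggingface_hub installed.
    from huggingface_hub import HfApi
    api = HfApi()  # uses the token from `huggingface-cli login`
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="model")

if __name__ == "__main__":
    args = build_parser().parse_args()
    push(args.repo_id, args.folder)
```

Run `huggingface-cli login` once beforehand so the upload is authenticated.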