DFloat11/FLUX.1-dev-DF11

lossless compression

70% size, 100% accuracy


LeanQuant committed on Aug 28 · Commit d662d6d (verified) · 1 parent: dbb180b

Update README.md

Files changed (1)

README.md +37 -12

@@ -26,7 +26,7 @@ Key benefits:
 ### 🔧 How to Use
-1. Install the DFloat11 pip package *(installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed)*:
     ```bash
     pip install dfloat11[cuda12]
@@ -34,31 +34,56 @@ Key benefits:
     # pip install dfloat11[cuda11]
     ```
-2. To use the DFloat11 model, run the following example code in Python:
     ```python
     import torch
-    from diffusers import FluxPipeline
     from dfloat11 import DFloat11Model
-    pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
     pipe.enable_model_cpu_offload()
-    DFloat11Model.from_pretrained('DFloat11/FLUX.1-dev-DF11', device='cpu', bfloat16_model=pipe.transformer)
-    prompt = "A futuristic cityscape at sunset, with flying cars, neon lights, and reflective water canals"
     image = pipe(
         prompt,
-        width=1920,
-        height=1440,
         guidance_scale=3.5,
         num_inference_steps=50,
         max_sequence_length=512,
         generator=torch.Generator(device="cuda").manual_seed(0)
     ).images[0]
     image.save("image.png")
     ```
 ### 📄 Learn More
 * **Paper**: [70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float](https://arxiv.org/abs/2504.11651)

 ### 🔧 How to Use
+1. Install or upgrade the DFloat11 package *(installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed)*:
     ```bash
     pip install dfloat11[cuda12]
     # pip install dfloat11[cuda11]
     ```
+2. Install or upgrade the diffusers package:
+    ```bash
+    pip install -U diffusers
+    ```
+3. Save the following code as a Python file `flux1.py`:
     ```python
     import torch
+    from diffusers import FluxPipeline, FluxTransformer2DModel
     from dfloat11 import DFloat11Model
+    from transformers.modeling_utils import no_init_weights
+    with no_init_weights():
+        transformer = FluxTransformer2DModel.from_config(
+            FluxTransformer2DModel.load_config(
+                "black-forest-labs/FLUX.1-dev", subfolder="transformer"
+            )
+        ).to(torch.bfloat16)
+    pipe = FluxPipeline.from_pretrained(
+        "black-forest-labs/FLUX.1-dev",
+        transformer=transformer,
+        torch_dtype=torch.bfloat16
+    )
+    DFloat11Model.from_pretrained(
+        'DFloat11/FLUX.1-dev-DF11',
+        device='cpu',
+        bfloat16_model=pipe.transformer,
+    )
     pipe.enable_model_cpu_offload()
+    prompt = "A scenic landscape with mountains, a river, and a clear sky."
     image = pipe(
         prompt,
+        width=1024,
+        height=1024,
         guidance_scale=3.5,
         num_inference_steps=50,
         max_sequence_length=512,
         generator=torch.Generator(device="cuda").manual_seed(0)
     ).images[0]
     image.save("image.png")
     ```
+4. Run `python flux1.py` in your terminal.
 ### 📄 Learn More
 * **Paper**: [70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float](https://arxiv.org/abs/2504.11651)
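
Reviewer note: the updated steps rely on four importable modules (`torch`, `diffusers`, `dfloat11`, and now `transformers` for `no_init_weights`). The following is a minimal sketch of a pre-flight check that could be run before `python flux1.py`. It is not part of this commit; the names `REQUIRED_MODULES` and `missing_modules` are hypothetical, and the check only confirms that each module can be located. It does not verify package versions or that a CUDA-capable GPU is present.

```python
from importlib.util import find_spec

# Top-level modules imported by flux1.py in the updated README.
# (Hypothetical constant name, introduced only for this sketch.)
REQUIRED_MODULES = ("torch", "diffusers", "dfloat11", "transformers")


def missing_modules(names=REQUIRED_MODULES):
    """Return the entries of `names` that cannot be found on the import path.

    find_spec() locates a top-level module without importing it, so this
    check is cheap and does not initialize CUDA.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All imports used by flux1.py can be resolved.")
```

If anything is reported missing, the install commands in steps 1 and 2 of the README are the place to start. `transformers` is not installed by either command explicitly, so it may need to be installed separately if it is not already pulled in as a dependency.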