Update app.py

Key Changes and Explanations:
Focused on Checkpoint to Diffusers: The code now only handles the conversion from an SDXL checkpoint (either .ckpt or .safetensors) to the Diffusers format. All checkpoint saving functionality has been removed.
load_sdxl_checkpoint: This function remains largely the same, loading the state dict and converting the relevant tensors to fp16.
build_diffusers_model: This NEW function takes the extracted state dictionaries and constructs the actual Diffusers model components (text encoders, VAE, UNet). The core pattern is sketched after the list below.
* It loads the configurations (e.g., CLIPTextConfig, UNet2DConditionModel.config) from a reference model. This is important because the checkpoint only contains the weights, not the architecture definition. You provide the path to a reference Diffusers model (like stabilityai/stable-diffusion-xl-base-1.0) in the Gradio interface, and the code uses it to get the correct configurations. If no reference model path is given, it defaults to stabilityai/stable-diffusion-xl-base-1.0.
* It then creates empty instances of the models (e.g., CLIPTextModel(config_text_encoder1)).
* It loads the extracted state_dict (weights) into these empty models using model.load_state_dict(state_dict).
* It explicitly moves the models to fp16 using .to(torch.float16).
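In brief, the per-component pattern looks like this (a minimal sketch for the first text encoder; `text_encoder1_state` stands for the state dict extracted by load_sdxl_checkpoint):

```python
import torch
from transformers import CLIPTextModel, CLIPTextConfig

# Architecture comes from the reference model's config; weights come from the checkpoint.
config = CLIPTextConfig.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="text_encoder"
)
text_encoder1 = CLIPTextModel(config)               # empty model with the right architecture
text_encoder1.load_state_dict(text_encoder1_state)  # raises if keys mismatch (strict=True by default)
text_encoder1.to(torch.float16)                     # keep the weights in fp16
```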
convert_and_save_sdxl_to_diffusers: This function now:
* Calls load_sdxl_checkpoint to get the component state dictionaries.
* Calls build_diffusers_model to construct the Diffusers model components.
* Creates a StableDiffusionXLPipeline from these components. Important: the tokenizer, tokenizer_2, and scheduler are also part of a Diffusers model, so they are loaded from the same reference model used for the configurations and passed to the pipeline as well (see the example after this list).
* Saves the pipeline using pipeline.save_pretrained(output_path).
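A hypothetical call, with placeholder paths:

```python
# Convert a local SDXL checkpoint to a Diffusers directory, using the
# default SDXL base repo for configs, tokenizers, and scheduler.
convert_and_save_sdxl_to_diffusers(
    checkpoint_path="/content/my_sdxl.safetensors",  # hypothetical input checkpoint
    output_path="/content/output",
    reference_model_path=None,  # None -> stabilityai/stable-diffusion-xl-base-1.0
)
```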
upload_to_huggingface (using upload_folder): Crucially, I've changed this to use api.upload_folder(folder_path=model_path, repo_id=model_repo). This is the correct way to upload an entire Diffusers model directory to the Hugging Face Hub. The previous version only printed the repo URL without actually uploading anything.
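For reference, the upload boils down to this (a sketch; `HfApi` also accepts the token at construction time, and `upload_folder` takes optional `repo_type`/`commit_message` arguments):

```python
from huggingface_hub import HfApi

api = HfApi(token=hf_token)  # or rely on a cached `huggingface-cli login`
api.upload_folder(
    folder_path=model_path,  # the saved Diffusers directory
    repo_id=model_repo,      # e.g. "username/my-sdxl-model"
)
```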
main Function (Simplified): The main function is now much simpler, only calling the conversion and upload functions.
Gradio Interface:
The model_to_load label is clarified to specify that it's for checkpoints.
The output_path label is clarified to indicate that it's for the Diffusers format output.
The reference_model input is now optional and defaults to stabilityai/stable-diffusion-xl-base-1.0.
Removed unnecessary functions: the checkpoint-saving and checkpoint/dtype-detection helpers (save_sdxl_as_checkpoint, get_save_dtype, determine_load_checkpoint) are gone.
Gemini probably broke it again but it's OK, I saved the old code XD
```diff
@@ -2,7 +2,7 @@ import os
 import gradio as gr
 import torch
-from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, AutoencoderKL
-from transformers import CLIPTextModel
+from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, AutoencoderKL, EulerDiscreteScheduler
+from transformers import CLIPTextModel, CLIPTextConfig, CLIPTokenizer
 from safetensors.torch import load_file
 from collections import OrderedDict
 import re
@@ -25,7 +25,7 @@ from huggingface_hub.errors import HfHubHTTPError
 
 # ---------------------- DEPENDENCIES ----------------------
 def install_dependencies_gradio():
-    """Installs the necessary dependencies
+    """Installs the necessary dependencies."""
     try:
         subprocess.run(["pip", "install", "-U", "torch", "diffusers", "transformers", "accelerate", "safetensors", "huggingface_hub", "xformers"])
         print("Dependencies installed successfully.")
@@ -33,26 +33,6 @@ def install_dependencies_gradio():
         print(f"Error installing dependencies: {e}")
 
 # ---------------------- UTILITY FUNCTIONS ----------------------
-def get_save_dtype(save_precision_as):
-    """Determines the save dtype based on the user's choice."""
-    if save_precision_as == "fp16":
-        return torch.float16
-    elif save_precision_as == "bf16":
-        return torch.bfloat16
-    elif save_precision_as == "float":
-        return torch.float32
-    else:
-        return None
-
-def determine_load_checkpoint(model_to_load):
-    """Determines if the model to load is a checkpoint or a Diffusers model."""
-    if model_to_load.endswith('.ckpt') or model_to_load.endswith('.safetensors'):
-        return True
-    elif os.path.isdir(model_to_load):
-        required_folders = {"unet", "text_encoder", "text_encoder_2", "tokenizer", "tokenizer_2", "scheduler", "vae"}
-        if required_folders.issubset(set(os.listdir(model_to_load))) and os.path.isfile(os.path.join(model_to_load, "model_index.json")):
-            return False
-    return None
 
 def increment_filename(filename):
     """Increments the filename to avoid overwriting existing files."""
@@ -63,72 +43,105 @@ def increment_filename(filename):
         counter += 1
     return filename
 
-# ---------------------- UPLOAD FUNCTION
+# ---------------------- UPLOAD FUNCTION ----------------------
 def create_model_repo(api, user, orgs_name, model_name, make_private=False):
-    """Creates a Hugging Face model repository
+    """Creates a Hugging Face model repository."""
     repo_id = f"{orgs_name}/{model_name.strip()}" if orgs_name else f"{user['name']}/{model_name.strip()}"
     try:
-        # Attempt to create the repository
         api.create_repo(repo_id=repo_id, repo_type="model", private=make_private)
         print(f"Model repo '{repo_id}' created.")
     except HfHubHTTPError:
         print(f"Model repo '{repo_id}' already exists.")
-
     return repo_id
 
 # ---------------------- MODEL LOADING AND CONVERSION ----------------------
-def load_sdxl_model(model_to_load, is_load_checkpoint, load_dtype):
-    """Loads the SDXL model from a checkpoint or Diffusers model."""
-    model_load_message = "checkpoint" if is_load_checkpoint else "Diffusers" + (" as fp16" if load_dtype == torch.float16 else "")
-    print(f"Loading {model_load_message}: {model_to_load}")
-
-    ...  # (checkpoint-loading branch not recoverable from the page extraction)
-    else:
-        pipeline = StableDiffusionXLPipeline.from_pretrained(model_to_load, torch_dtype=load_dtype)
-        text_encoder1 = pipeline.text_encoder
-        text_encoder2 = pipeline.text_encoder_2
-        vae = pipeline.vae
-        unet = pipeline.unet
-
-    return text_encoder1, text_encoder2, vae, unet
-
-def convert_and_save_sdxl_model(model_to_load, is_save_checkpoint, loaded_model_data, save_dtype):
-    """Converts and saves the SDXL model as either a checkpoint or a Diffusers model."""
-    text_encoder1, text_encoder2, vae, unet = loaded_model_data
-    if is_save_checkpoint:
-        save_sdxl_as_checkpoint(model_to_load, text_encoder1, text_encoder2, vae, unet, save_dtype)
-    else:
-        save_sdxl_as_diffusers(model_to_load, text_encoder1, text_encoder2, vae, unet, save_dtype)
-
-def save_sdxl_as_checkpoint(model_to_save, text_encoder1, text_encoder2, vae, unet, save_dtype):
-    """Saves the SDXL model components as a checkpoint file."""
-    # Implement saving logic here
-    print(f"Model saved as checkpoint: {model_to_save}")
-
-def save_sdxl_as_diffusers(model_to_save, text_encoder1, text_encoder2, vae, unet, save_dtype):
-    """Saves the SDXL model components as a Diffusers model."""
-    pipeline = StableDiffusionXLPipeline(
-        vae=vae,
-        text_encoder=text_encoder1,
-        text_encoder_2=text_encoder2,
-        unet=unet
-    )
-    pipeline.save_pretrained(model_to_save)
-    print(f"Model saved as Diffusers format: {model_to_save}")
+def load_sdxl_checkpoint(checkpoint_path):
+    """Loads an SDXL checkpoint (.ckpt or .safetensors) and returns component state dicts."""
+    if checkpoint_path.endswith(".safetensors"):
+        state_dict = load_file(checkpoint_path, device="cpu")
+    elif checkpoint_path.endswith(".ckpt"):
+        state_dict = torch.load(checkpoint_path, map_location="cpu")["state_dict"]
+    else:
+        raise ValueError("Unsupported checkpoint format. Must be .safetensors or .ckpt")
+
+    text_encoder1_state = OrderedDict()
+    text_encoder2_state = OrderedDict()
+    vae_state = OrderedDict()
+    unet_state = OrderedDict()
+
+    # Route each tensor to its component by key prefix, converting to fp16 as we go.
+    for key, value in state_dict.items():
+        if key.startswith("first_stage_model."):  # VAE
+            vae_state[key.replace("first_stage_model.", "")] = value.to(torch.float16)
+        elif key.startswith("condition_model.model.text_encoder."):  # Text Encoder 1
+            text_encoder1_state[key.replace("condition_model.model.text_encoder.", "")] = value.to(torch.float16)
+        elif key.startswith("condition_model.model.text_encoder_2."):  # Text Encoder 2
+            text_encoder2_state[key.replace("condition_model.model.text_encoder_2.", "")] = value.to(torch.float16)
+        elif key.startswith("model.diffusion_model."):  # UNet
+            unet_state[key.replace("model.diffusion_model.", "")] = value.to(torch.float16)
+
+    return text_encoder1_state, text_encoder2_state, vae_state, unet_state
+
+def build_diffusers_model(text_encoder1_state, text_encoder2_state, vae_state, unet_state, reference_model_path=None):
+    """Builds the Diffusers pipeline components from the loaded state dicts."""
+    # --- Load configurations, create empty models, load state dicts ---
+    if not reference_model_path:  # Default
+        reference_model_path = "stabilityai/stable-diffusion-xl-base-1.0"
+
+    # 1. Text encoders: architecture from the reference config, weights from the checkpoint
+    config_text_encoder1 = CLIPTextConfig.from_pretrained(reference_model_path, subfolder="text_encoder")
+    config_text_encoder2 = CLIPTextConfig.from_pretrained(reference_model_path, subfolder="text_encoder_2")
+
+    text_encoder1 = CLIPTextModel(config_text_encoder1)
+    text_encoder2 = CLIPTextModel(config_text_encoder2)
+    text_encoder1.load_state_dict(text_encoder1_state)
+    text_encoder2.load_state_dict(text_encoder2_state)
+    text_encoder1.to(torch.float16)  # Ensure fp16
+    text_encoder2.to(torch.float16)
+
+    # 2. VAE
+    vae = AutoencoderKL.from_pretrained(reference_model_path, subfolder="vae")
+    vae.load_state_dict(vae_state)
+    vae.to(torch.float16)
+
+    # 3. UNet
+    unet = UNet2DConditionModel.from_pretrained(reference_model_path, subfolder="unet")
+    unet.load_state_dict(unet_state)
+    unet.to(torch.float16)
+
+    return text_encoder1, text_encoder2, vae, unet
+
+def convert_and_save_sdxl_to_diffusers(checkpoint_path, output_path, reference_model_path):
+    """Converts an SDXL checkpoint to Diffusers format and saves it."""
+    text_encoder1_state, text_encoder2_state, vae_state, unet_state = load_sdxl_checkpoint(checkpoint_path)
+    text_encoder1, text_encoder2, vae, unet = build_diffusers_model(
+        text_encoder1_state, text_encoder2_state, vae_state, unet_state, reference_model_path
+    )
+
+    # Tokenizers and scheduler carry no trained weights, so take them from the reference model.
+    ref = reference_model_path if reference_model_path else "stabilityai/stable-diffusion-xl-base-1.0"
+    tokenizer = CLIPTokenizer.from_pretrained(ref, subfolder="tokenizer")
+    tokenizer_2 = CLIPTokenizer.from_pretrained(ref, subfolder="tokenizer_2")
+    scheduler = EulerDiscreteScheduler.from_pretrained(ref, subfolder="scheduler")
+
+    pipeline = StableDiffusionXLPipeline(
+        vae=vae,
+        text_encoder=text_encoder1,
+        text_encoder_2=text_encoder2,
+        unet=unet,
+        tokenizer=tokenizer,
+        tokenizer_2=tokenizer_2,
+        scheduler=scheduler,
+    )
+    pipeline.save_pretrained(output_path)
+    print(f"Model saved as Diffusers format: {output_path}")
 
 # ---------------------- UPLOAD FUNCTION ----------------------
 def upload_to_huggingface(model_path, hf_token, orgs_name, model_name, make_private):
@@ -137,30 +150,22 @@ def upload_to_huggingface(model_path, hf_token, orgs_name, model_name, make_private):
     api = HfApi()
     user = api.whoami(hf_token)
     model_repo = create_model_repo(api, user, orgs_name, model_name, make_private)
-
-    # Upload logic here
+    api.upload_folder(folder_path=model_path, repo_id=model_repo)  # Use upload_folder
     print(f"Model uploaded to: https://huggingface.co/{model_repo}")
 
 # ---------------------- GRADIO INTERFACE ----------------------
-def main(model_to_load,
-    """Main function
-
-    is_save_checkpoint = not is_load_checkpoint
-
-    loaded_model_data = load_sdxl_model(model_to_load, is_load_checkpoint, load_dtype)
-    convert_and_save_sdxl_model(model_to_load, is_save_checkpoint, loaded_model_data, load_dtype)
+def main(model_to_load, reference_model, output_path, hf_token, orgs_name, model_name, make_private):
+    """Main function: SDXL checkpoint to Diffusers, always fp16."""
+
+    convert_and_save_sdxl_to_diffusers(model_to_load, output_path, reference_model)
     upload_to_huggingface(output_path, hf_token, orgs_name, model_name, make_private)
 
     return "Conversion and upload completed successfully!"
 
 with gr.Blocks() as demo:
-    model_to_load = gr.Textbox(label="
-
-    global_step = gr.Number(value=0, label="Global Step to Write (Checkpoint)")
-    reference_model = gr.Textbox(label="Reference Diffusers Model", placeholder="e.g., stabilityai/stable-diffusion-xl-base-1.0")
-    output_path = gr.Textbox(label="Output Path", value="/content/output")
+    model_to_load = gr.Textbox(label="SDXL Checkpoint to Load (.ckpt or .safetensors)", placeholder="Path to checkpoint")
+    reference_model = gr.Textbox(label="Reference Diffusers Model (Optional)", placeholder="e.g., stabilityai/stable-diffusion-xl-base-1.0 (Leave blank for default)")
+    output_path = gr.Textbox(label="Output Path (Diffusers Format)", value="/content/output")  # Clarified label
     hf_token = gr.Textbox(label="Hugging Face Token", placeholder="Your Hugging Face write token")
     orgs_name = gr.Textbox(label="Organization Name (Optional)", placeholder="Your organization name")
     model_name = gr.Textbox(label="Model Name", placeholder="The name of your model on Hugging Face")
@@ -169,6 +174,6 @@ with gr.Blocks() as demo:
     convert_button = gr.Button("Convert and Upload")
     output = gr.Markdown()
 
-    convert_button.click(fn=main, inputs=[model_to_load,
+    convert_button.click(fn=main, inputs=[model_to_load, reference_model, output_path, hf_token, orgs_name, model_name, make_private], outputs=output)
 
 demo.launch()
```