How to Use
diffusers
```python
import torch
from PIL import Image
from dfloat11 import DFloat11Model
from diffusers import LongCatImageEditPipeline, LongCatImageTransformer2DModel
from transformers.modeling_utils import no_init_weights

# Build an empty bfloat16 transformer from its config, then fill it with the
# DFloat11-compressed weights.
with no_init_weights():
    transformer = LongCatImageTransformer2DModel.from_config(
        LongCatImageTransformer2DModel.load_config(
            "meituan-longcat/LongCat-Image-Edit", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16,
    ).to(torch.bfloat16)

DFloat11Model.from_pretrained(
    "mingyi456/LongCat-Image-Edit-DF11",
    device="cpu",
    bfloat16_model=transformer,
)

pipe = LongCatImageEditPipeline.from_pretrained(
    "meituan-longcat/LongCat-Image-Edit",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)

# The Qwen2.5-VL text encoder is also replaced with a DF11-compressed version.
DFloat11Model.from_pretrained(
    "mingyi456/Qwen2.5-VL-7B-Instruct-DF11",
    device="cpu",
    bfloat16_model=pipe.text_encoder,
)

pipe.enable_model_cpu_offload()

img = Image.open('assets/test.png').convert('RGB')
prompt = '小猫变成狗'  # "turn the kitten into a dog"
image = pipe(
    img,
    prompt,
    negative_prompt='',
    guidance_scale=4.5,
    num_inference_steps=50,
    num_images_per_prompt=1,
    generator=torch.Generator("cpu").manual_seed(43),
).images[0]

image.save('longcat-image-edit.png')
```
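To confirm the reduced footprint, you can check the peak GPU allocation after generation. This assumes a CUDA device; `torch.cuda.max_memory_allocated()` reports the high-water mark in bytes:

```python
# Peak GPU memory used during the run (CUDA only).
print(f"Peak GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```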
ComfyUI
Currently, this model is not supported natively in ComfyUI. If it ever gains native support, let me know and I will look into supporting it.
Compression details
This is the `pattern_dict` used for compression:

```python
pattern_dict = {
    r"transformer_blocks\.\d+": (
        "norm1.linear",
        "norm1_context.linear",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "attn.add_q_proj",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.to_add_out",
        "ff.net.0.proj",
        "ff.net.2",
        "ff_context.net.0.proj",
        "ff_context.net.2",
    ),
    r"single_transformer_blocks\.\d+": (
        "norm.linear",
        "proj_mlp",
        "proj_out",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
    ),
}
```
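Each key is a regex that matches block prefixes in the transformer's module tree, and each tuple lists the linear submodules inside those blocks whose bfloat16 weights get DF11-compressed. As a rough sketch of the selection (not the actual `dfloat11` internals), a hypothetical helper would behave like this:

```python
import re

def matched_modules(model, pattern_dict):
    """Hypothetical helper: list the module names a pattern_dict selects."""
    selected = []
    for name, _ in model.named_modules():
        for block_pattern, submodules in pattern_dict.items():
            m = re.match(block_pattern, name)  # anchored at the name's start
            if m and any(name == f"{m.group(0)}.{sub}" for sub in submodules):
                selected.append(name)
    return selected

# e.g. matched_modules(transformer, pattern_dict) would include names such as
# "transformer_blocks.0.attn.to_q" and "single_transformer_blocks.3.proj_mlp".
```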