wan-disney-DCM-distilled
This is a standard PEFT LoRA derived from Wan-AI/Wan2.1-T2V-1.3B-Diffusers.
The main validation prompt used during training was:
A black and white animated scene unfolds featuring a distressed upright cow with prominent horns and expressive eyes, suspended by its legs from a hook on a static background wall. A smaller Mickey Mouse-like character enters, standing near a wooden bench, initiating interaction between the two. The cow's posture changes as it leans, stretches, and falls, while the mouse watches with a concerned expression, its face a mixture of curiosity and worry, in a world devoid of color.
Validation settings
- CFG: 1.0
- CFG Rescale: 0.0
- Steps: 8
- Sampler: FlowMatchEulerDiscreteScheduler (see the scheduler sketch below)
- Seed: 42
- Resolution: 832x480
Note: The validation settings are not necessarily the same as the training settings.
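To mirror the validation sampler at inference time, the scheduler can be swapped on the loaded pipeline. The snippet below is a minimal sketch: applying the training-time shift of 17.0 to the inference scheduler is an assumption made for illustration, not something the validation settings specify.

import torch
from diffusers import DiffusionPipeline, FlowMatchEulerDiscreteScheduler

# Load the base pipeline (same call as in the Inference section further down).
pipeline = DiffusionPipeline.from_pretrained('Wan-AI/Wan2.1-T2V-1.3B-Diffusers', torch_dtype=torch.bfloat16)

# Swap in the validation sampler. Passing shift=17.0 mirrors the flow_matching
# shift listed under Training settings; whether it is also useful at inference
# is an assumption, not stated by this card.
pipeline.scheduler = FlowMatchEulerDiscreteScheduler.from_config(pipeline.scheduler.config, shift=17.0)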
You can find some example images in the following gallery:

- Prompt: the validation prompt shown above
- Negative Prompt (translated from the Chinese string used verbatim in the inference example below): garish colours, overexposed, static, blurry details, subtitles, style, artwork, painting, picture, still, overall greyish, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless picture, cluttered background, three legs, many people in the background, walking backwards
The text encoder was not trained. You may reuse the base model text encoder for inference.
Training settings
- Training epochs: 0
- Training steps: 300
- Learning rate: 0.0001
  - Learning rate schedule: cosine
  - Warmup steps: 400000
- Max grad value: 0.01
- Effective batch size: 2
  - Micro-batch size: 2
  - Gradient accumulation steps: 1
  - Number of GPUs: 1
- Gradient checkpointing: True
- Prediction type: flow_matching (extra parameters=['shift=17.0'])
- Optimizer: adamw_bf16
- Trainable parameter precision: Pure BF16
- Base model precision: int8-quanto
- Caption dropout probability: 0.1%
- LoRA Rank: 128
- LoRA Alpha: 128.0
- LoRA Dropout: 0.1
- LoRA initialisation style: default (see the PEFT sketch below)
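For reference, the LoRA hyperparameters above map roughly onto the following PEFT configuration. This is a minimal sketch rather than the trainer's exact config; in particular, the target_modules are an illustrative assumption.

from peft import LoraConfig

# Rough equivalent of the settings above: rank 128, alpha 128, dropout 0.1,
# "default" initialisation (init_lora_weights=True). The target_modules are
# an assumption for illustration, not taken from the actual training run.
lora_config = LoraConfig(
    r=128,
    lora_alpha=128,
    lora_dropout=0.1,
    init_lora_weights=True,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
)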
Datasets
disney-black-and-white-wan
- Repeats: 10
- Total number of images: 68
- Total number of aspect buckets: 1
- Resolution: 0.2304 megapixels
- Cropped: False
- Crop style: None
- Crop aspect: None
- Used for regularisation data: No
Inference
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_gif

model_id = 'Wan-AI/Wan2.1-T2V-1.3B-Diffusers'
adapter_id = 'bghira/wan-disney-DCM-distilled'

# Load the base pipeline directly in bf16 and attach the LoRA adapter.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "A black and white animated scene unfolds featuring a distressed upright cow with prominent horns and expressive eyes, suspended by its legs from a hook on a static background wall. A smaller Mickey Mouse-like character enters, standing near a wooden bench, initiating interaction between the two. The cow's posture changes as it leans, stretches, and falls, while the mouse watches with a concerned expression, its face a mixture of curiosity and worry, in a world devoid of color."
negative_prompt = '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走'

## Optional: quantise the transformer to save on VRAM.
## Note: the model was quantised during training, so it is recommended to do the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# The pipeline is already in its target precision; just move it to the available accelerator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=8,
    generator=torch.Generator(device=device).manual_seed(42),
    width=832,
    height=480,
    guidance_scale=1.0,
).frames[0]  # the Wan text-to-video pipeline returns frames rather than images

export_to_gif(model_output, "output.gif", fps=15)
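If an MP4 is preferred over a GIF, diffusers also provides an export_to_video helper that accepts the same frame list; the snippet below is an optional variant of the final step above (depending on your diffusers version it may require imageio-ffmpeg or OpenCV).

from diffusers.utils import export_to_video

# Write the same frames to an MP4 instead of a GIF.
export_to_video(model_output, "output.mp4", fps=15)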