---
license: apache-2.0
language:
- en
tags:
- video
- video-generation
- video-to-video
- controlnet
- diffusers
pipeline_tag: video-to-video
---
# Dilated Controlnet for Wan2.1
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/63fde49f6315a264aba6a7ed/3w5CQ-quMowfEaS90xyrd.mp4"></video>
This repo contains the code for the dilated controlnet module for the Wan2.1 model.
A dilated controlnet has fewer basic blocks and an additional `stride` parameter. For the Wan2.1 1.3B model, the controlnet uses 8 blocks with stride = 3.
See the <a href="https://github.com/TheDenk/wan2.1-dilated-controlnet">GitHub code</a>.
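
To make the role of `stride` more concrete, here is a minimal sketch of how a small set of controlnet blocks could be spread over the base model's transformer blocks. The pairing rule (every `stride`-th base block receives the output of the next controlnet block) and the base block count of 30 are assumptions made for illustration; the function name `controlled_block_indices` is hypothetical and is not taken from the repository.

```python
def controlled_block_indices(num_transformer_blocks: int,
                             num_controlnet_blocks: int,
                             stride: int) -> list[int]:
    """Return indices of base-model blocks that would receive a controlnet residual.

    Assumed rule (illustrative only): base block i is paired with controlnet
    block i // stride when i is a multiple of stride and that controlnet
    block exists.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    indices = []
    for i in range(num_transformer_blocks):
        if i % stride == 0 and i // stride < num_controlnet_blocks:
            indices.append(i)
    return indices


if __name__ == "__main__":
    # 8 controlnet blocks with stride 3, over an assumed 30 base blocks.
    print(controlled_block_indices(30, 8, 3))
```

Under these assumptions, 8 controlnet blocks with stride 3 reach up to base block 21, so the control signal is spread over a deeper part of the network than 8 consecutive blocks would cover.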
General scheme

### How to
Clone repo
```bash
git clone https://github.com/TheDenk/wan2.1-dilated-controlnet.git
cd wan2.1-dilated-controlnet
```
Create venv
```bash
python -m venv venv
source venv/bin/activate
```
Install requirements
```bash
pip install -r requirements.txt
```
### Inference examples
#### Inference with CLI
```bash
python -m inference.cli_demo \
--video_path "resources/physical-4.mp4" \
--prompt "A balloon filled with water was thrown to the ground, exploding and splashing water in all directions. There were graffiti on the wall, studio lighting, and commercial movie shooting." \
--controlnet_type "hed" \
--controlnet_stride 3 \
--base_model_path Wan-AI/Wan2.1-T2V-1.3B-Diffusers \
--controlnet_model_path TheDenk/wan2.1-t2v-1.3b-controlnet-hed-v1
```
#### Inference with Gradio
```bash
python -m inference.gradio_web_demo \
--controlnet_type "hed" \
--base_model_path Wan-AI/Wan2.1-T2V-1.3B-Diffusers \
--controlnet_model_path TheDenk/wan2.1-t2v-1.3b-controlnet-hed-v1
```
#### Detailed Inference
```bash
python -m inference.cli_demo \
--video_path "resources/physical-4.mp4" \
--prompt "A balloon filled with water was thrown to the ground, exploding and splashing water in all directions. There were graffiti on the wall, studio lighting, and commercial movie shooting." \
--controlnet_type "hed" \
--base_model_path Wan-AI/Wan2.1-T2V-1.3B-Diffusers \
--controlnet_model_path TheDenk/wan2.1-t2v-1.3b-controlnet-hed-v1 \
--controlnet_weight 0.8 \
--controlnet_guidance_start 0.0 \
--controlnet_guidance_end 0.8 \
--controlnet_stride 3 \
--num_inference_steps 50 \
--guidance_scale 5.0 \
--video_height 480 \
--video_width 832 \
--num_frames 81 \
--negative_prompt "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards" \
--seed 42 \
--out_fps 16 \
--output_path "result.mp4"
```
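
The `--controlnet_guidance_start` and `--controlnet_guidance_end` values read as fractions of the denoising schedule. The sketch below shows one plausible interpretation, in which the controlnet is applied on a step when the fraction of completed steps lies inside the `[start, end]` window. The exact rule used by `inference.cli_demo` is not stated here, so treat the comparison and the helper name `controlnet_active_steps` as assumptions.

```python
def controlnet_active_steps(num_inference_steps: int,
                            guidance_start: float,
                            guidance_end: float) -> list[int]:
    """Return the step indices on which the controlnet would be applied.

    Assumed rule (illustrative only): step i is active when
    guidance_start <= i / num_inference_steps <= guidance_end.
    """
    active = []
    for step in range(num_inference_steps):
        progress = step / num_inference_steps  # fraction of the schedule already done
        if guidance_start <= progress <= guidance_end:
            active.append(step)
    return active


if __name__ == "__main__":
    steps = controlnet_active_steps(50, 0.0, 0.8)
    print(f"controlnet active on {len(steps)} of 50 steps, last active step: {steps[-1]}")
```

With this reading, lowering `--controlnet_guidance_end` leaves the final denoising steps to the base model alone, which trades strict adherence to the control video for more freedom in fine detail.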
## Acknowledgements
The original code and models come from [Wan2.1](https://github.com/Wan-Video/Wan2.1).
## Citations
```
@misc{TheDenk,
title={Dilated Controlnet},
author={Karachev Denis},
url={https://github.com/TheDenk/wan2.1-dilated-controlnet},
publisher={Github},
year={2025}
}
```
## Contacts
<p>Issues should be raised directly in the repository. For professional support and recommendations, please contact <a>[email protected]</a>.</p>