nielsr (HF staff) committed on
Commit aef1284 · verified · 1 parent: a064a6c

Add links to paper, code, and project page


This PR ensures the model card links to the GitHub code repository, the technical report, and the project page, making it easier for people to find all of the related resources.

Files changed (1)
  1. README.md +12 -11
README.md CHANGED
@@ -1,16 +1,17 @@
 ---
-license: apache-2.0
 language:
 - en
 - zh
+library_name: diffusers
+license: apache-2.0
 pipeline_tag: text-to-video
 tags:
 - video generation
-library_name: diffusers
 inference:
   parameters:
     num_inference_steps: 10
 ---
+
 # Wan2.1
 
 <p align="center">
@@ -18,7 +19,7 @@ inference:
 <p>
 
 <p align="center">
-💜 <a href=""><b>Wan</b></a> &nbsp&nbsp | &nbsp&nbsp 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> &nbsp&nbsp | &nbsp&nbsp🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="">Paper (Coming soon)</a> &nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://wanxai.com">Blog</a> &nbsp&nbsp | &nbsp&nbsp💬 <a href="https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg">WeChat Group</a>&nbsp&nbsp | &nbsp&nbsp 📖 <a href="https://discord.gg/p5XbdQV7">Discord</a>&nbsp&nbsp
+💜 <a href="https://wan.video"><b>Wan</b></a> &nbsp&nbsp | &nbsp&nbsp 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> &nbsp&nbsp | &nbsp&nbsp🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://huggingface.co/papers/2503.20314">Paper (Coming soon)</a> &nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://wan.video/welcome?spm=a2ty_o02.30011076.0.0.6c9ee41eCcluqg">Blog</a> &nbsp&nbsp | &nbsp&nbsp💬 <a href="https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg">WeChat Group</a>&nbsp&nbsp | &nbsp&nbsp 📖 <a href="https://discord.gg/AKNgpMK4Yj">Discord</a>&nbsp&nbsp
 <br>
 
 -----
@@ -68,13 +69,13 @@ This repository features our T2V-14B model, which establishes a new SOTA perform
 
 #### Installation
 Clone the repo:
-```
+```sh
 git clone https://github.com/Wan-Video/Wan2.1.git
 cd Wan2.1
 ```
 
 Install dependencies:
-```
+```sh
 # Ensure torch >= 2.4.0
 pip install -r requirements.txt
 ```
@@ -142,13 +143,13 @@ To facilitate implementation, we will start with a basic version of the inferenc
 
 - Single-GPU inference
 
-```
+```sh
 python generate.py --task t2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-T2V-14B --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
 ```
 
 If you encounter OOM (Out-of-Memory) issues, you can use the `--offload_model True` and `--t5_cpu` options to reduce GPU memory usage. For example, on an RTX 4090 GPU:
 
-```
+```sh
 python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
 ```
 
@@ -157,7 +158,7 @@ python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B
 
 - Multi-GPU inference using FSDP + xDiT USP
 
-```
+```sh
 pip install "xfuser>=0.4.1"
 torchrun --nproc_per_node=8 generate.py --task t2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-T2V-14B --dit_fsdp --t5_fsdp --ulysses_size 8 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
 ```
@@ -172,7 +173,7 @@ Extending the prompts can effectively enrich the details in the generated videos
 - Configure the environment variable `DASH_API_KEY` to specify the Dashscope API key. For users of Alibaba Cloud's international site, you also need to set the environment variable `DASH_API_URL` to 'https://dashscope-intl.aliyuncs.com/api/v1'. For more detailed instructions, please refer to the [dashscope document](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api?spm=a2c63.p38356.0.i1).
 - Use the `qwen-plus` model for text-to-video tasks and `qwen-vl-max` for image-to-video tasks.
 - You can modify the model used for extension with the parameter `--prompt_extend_model`. For example:
-```
+```sh
 DASH_API_KEY=your_key python generate.py --task t2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-T2V-14B --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage" --use_prompt_extend --prompt_extend_method 'dashscope' --prompt_extend_target_lang 'ch'
 ```
 
@@ -184,13 +185,13 @@ DASH_API_KEY=your_key python generate.py --task t2v-14B --size 1280*720 --ckpt_
 - Larger models generally provide better extension results but require more GPU memory.
 - You can modify the model used for extension with the parameter `--prompt_extend_model`, allowing you to specify either a local model path or a Hugging Face model. For example:
 
-```
+```sh
 python generate.py --task t2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-T2V-14B --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage" --use_prompt_extend --prompt_extend_method 'local_qwen' --prompt_extend_target_lang 'ch'
 ```
 
 ##### (3) Runing local gradio
 
-```
+```sh
 cd gradio
 # if one uses dashscope's API for prompt extension
 DASH_API_KEY=your_key python t2v_14B_singleGPU.py --prompt_extend_method 'dashscope' --ckpt_dir ./Wan2.1-T2V-14B