Video-to-Video · SeedVR

Commit 9f7bb1a (verified) · Iceclear committed · 1 parent: 4f92a92

Update README.md

Files changed (1): README.md (+66 −65)
---
license: apache-2.0
library_name: seedvr
---

<div align="center">
    <img src="assets/seedvr_logo.png" alt="SeedVR" width="400"/>
</div>

# SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration
> [Jianyi Wang](https://iceclear.github.io), [Zhijie Lin](https://scholar.google.com/citations?user=xXMj6_EAAAAJ&hl=zh-CN), [Meng Wei](https://openreview.net/profile?id=~Meng_Wei11), [Ceyuan Yang](https://scholar.google.com/citations?user=uPmTOHAAAAAJ&hl=zh-CN), [Fei Xiao](https://openreview.net/profile?id=~Fei_xiao8), [Chen Change Loy](https://www.mmlab-ntu.com/person/ccloy/), [Lu Jiang](http://www.lujiang.info/)
>
> **CVPR 2025 (Highlight)**

<p align="center">
  <a href="https://iceclear.github.io/projects/seedvr/">
    <img src="https://img.shields.io/badge/SeedVR-Website-0A66C2?logo=safari&logoColor=white" alt="SeedVR Website"/>
  </a>
  <a href="https://arxiv.org/abs/2501.01320">
    <img src="https://img.shields.io/badge/SeedVR-Paper-red?logo=arxiv&logoColor=red" alt="SeedVR Paper on ArXiv"/>
  </a>
  <a href="https://www.youtube.com/watch?v=aPpBs_B2iCY" target="_blank">
    <img src="https://img.shields.io/badge/Demo%20Video-%23FF0000.svg?logo=YouTube&logoColor=white" alt="SeedVR Video Demo on YouTube"/>
  </a>
</p>

> **Why SeedVR:** Conventional restoration models perform poorly on both real-world and AIGC video restoration due to limited generation ability. Recent diffusion-based models improve performance by introducing a diffusion prior via ControlNet-like or adapter-like architectures. Despite these gains, such methods inherit the constraints of the diffusion prior: they share its biases (e.g., limited generation ability on small text and faces) and only work at fixed resolutions such as 512 or 1024. As a result, most existing diffusion-based restoration models rely on patch-based sampling, i.e., dividing the input video into overlapping spatio-temporal patches and fusing these patches with a Gaussian kernel at each diffusion step. The large overlap (e.g., 50% of the patch size) required for a coherent output without visible patch boundaries often makes inference considerably slow, and this inefficiency becomes even more pronounced for long videos at high resolutions. SeedVR follows state-of-the-art video generation training pipelines to tackle the key challenge in diffusion-based restoration: it enables arbitrary-resolution restoration without relying on any pretrained diffusion prior, while introducing advanced video generation techniques suited to restoration. As the largest-ever diffusion transformer model for generic video restoration, we hope SeedVR can push the frontiers of video restoration and inspire future research on large vision models for real-world video restoration.
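For intuition, the patch-based sampling scheme that SeedVR avoids can be sketched as follows. This is a minimal 2D NumPy illustration of blending overlapping patches with Gaussian border weights, not SeedVR code: the function names, patch size, stride, and kernel width are illustrative assumptions, and `denoise_patch` stands in for one per-patch diffusion denoising step.

```python
import numpy as np

def gaussian_weight(patch_h, patch_w, sigma_scale=0.25):
    """2D Gaussian kernel that down-weights patch borders so seams blend away."""
    ys = np.linspace(-1, 1, patch_h)
    xs = np.linspace(-1, 1, patch_w)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-(yy**2 + xx**2) / (2 * sigma_scale**2))

def fuse_patches(frame_h, frame_w, patch, stride, denoise_patch):
    """Run `denoise_patch` on overlapping tiles and fuse via weighted averaging."""
    out = np.zeros((frame_h, frame_w))
    weight = np.zeros((frame_h, frame_w))
    w = gaussian_weight(patch, patch)
    for y in range(0, frame_h - patch + 1, stride):
        for x in range(0, frame_w - patch + 1, stride):
            out[y:y + patch, x:x + patch] += w * denoise_patch(y, x, patch)
            weight[y:y + patch, x:x + patch] += w
    # Normalize by accumulated weights (guard against uncovered pixels).
    return out / np.maximum(weight, 1e-8)

# 50% overlap means stride = patch // 2, so the number of denoiser calls roughly
# quadruples versus non-overlapping tiles -- the inference cost noted above.
fused = fuse_patches(256, 256, patch=64, stride=32,
                     denoise_patch=lambda y, x, p: np.ones((p, p)))
```

Because this fusion must run at every diffusion step, the overlap multiplies the cost of the entire sampling loop, which is why an arbitrary-resolution model that processes the whole frame at once avoids the overhead.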

<p align="center"><img src="assets/teaser.png" width="100%"></p>

## 📮 Notice
**Limitations:** These are prototype models, and their performance may not perfectly align with the paper. Our methods are sometimes not robust to heavy degradations and very large motions, and share some failure cases with existing methods, e.g., failing to fully remove the degradation or generating unpleasant details. Moreover, due to their strong generation ability, our methods tend to over-generate details on inputs with very light degradations, e.g., 720p AIGC videos, occasionally leading to oversharpened results.

## ✍️ Citation

```bibtex
@article{wang2025seedvr2,
    title={SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training},
    author={Wang, Jianyi and Lin, Shanchuan and Lin, Zhijie and Ren, Yuxi and Wei, Meng and Yue, Zongsheng and Zhou, Shangchen and Chen, Hao and Zhao, Yang and Yang, Ceyuan and Xiao, Xuefeng and Loy, Chen Change and Jiang, Lu},
    journal={arXiv preprint arXiv:2506.05301},
    year={2025}
}

@inproceedings{wang2025seedvr,
    title={SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration},
    author={Wang, Jianyi and Lin, Zhijie and Wei, Meng and Zhao, Yang and Yang, Ceyuan and Loy, Chen Change and Jiang, Lu},
    booktitle={CVPR},
    year={2025}
}
```

## 📜 License

SeedVR and SeedVR2 are licensed under the Apache License 2.0.