Commit 111e2f9 by Sierkinhane (verified) · 1 parent: 2fef922

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@
 
  ## What is the new about Show-o2?
  We perform the unified learning of multimodal understanding and generation on the text token and **3D Causal VAE space**, which is scalable for **text, image, and video modalities**. A dual-path of spatial (-temporal) fusion is proposed to accommodate the distinct feature dependency of multimodal understanding and generation. We employ specific heads with **autoregressive modeling and flow matching** for the overall unified learning of **multimodal understanding, image/video and mixed-modality generation.**
-
+ <img src="overview.png" width="1000">
 
  ## Pre-trained Model Weigths
  The Show-o2 checkpoints can be found on Hugging Face: