Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 5 days ago • 13
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 12 days ago • 74
Putting the Object Back into Video Object Segmentation Paper • 2310.12982 • Published Oct 19, 2023