MulliVC: Multi-lingual Voice Conversion With Cycle Consistency Paper • 2408.04708 • Published Aug 8, 2024 • 7
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Paper • 2407.07053 • Published Jul 9, 2024 • 43
DIVOTrack: A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open Scenes Paper • 2302.07676 • Published Feb 15, 2023
TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT Paper • 2307.08674 • Published Jul 17, 2023 • 49
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts Paper • 2307.07218 • Published Jul 14, 2023 • 27
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias Paper • 2306.03509 • Published Jun 6, 2023 • 4
Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis Paper • 2306.03504 • Published Jun 6, 2023 • 8
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training Paper • 2305.10763 • Published May 18, 2023 • 3