PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation Paper • 2407.02869 • Published Jul 3 • 18
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes Paper • 2406.02897 • Published Jun 5 • 13
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning Paper • 2406.03344 • Published Jun 5 • 18