Generate videos from text prompts, with optional image inputs
Generate MIDI music from prompts
High-fidelity text-to-speech