microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated May 1 • 348k • 1.45k
Talking Face Generation with Multilingual TTS 👄 • Running • 554 • Generate a talking face video from text