
microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition • Updated • 1.12M • 1.27k
Generate depth video from input video
Text-to-3D and Image-to-3D Generation
Find matching images using configurations
High-quality speech synthesis powered by Kokoro TTS
Next-generation reasoning model that runs locally in-browser