
VAST-AI/MIDI-3D
Image-to-3D • Updated • 448 • 39
Generate text and speech responses from text, images, or audio input
Remove background from ID photos
Generate images from text prompts
Large Animatable Human Model
Try Orpheus TTS here
Convert vocals to match reference audio
Audio Conditioned LipSync with Latent Diffusion Models
Wan: Open and Advanced Large-Scale Video Generative Models