
nari-labs/Dia-1.6B-0626
Text-to-Speech • 2B • Updated • 18.4k • 39
Control 3D models using hand gestures and voice commands
Generate protein fitness scores and visualizations
Ultra Fast FLUX Kontext Dev for Image Editing
Audio-Driven Multi-Person Conversational Video Generation
Edit images with Kontext and LoRAs
Hand-controlled arpeggiator, drum machine, and visualizer
Describe images, videos, and audio
Kontext multi-image composition on FLUX[dev]
LightGlue demo
Convert web content to JSON using a custom schema
OmniGen2: Unified Image Understanding and Generation
Generate or edit images using text prompts