Generate images from text descriptions
Generate detailed prompts for Stable Diffusion
Transcribe audio from a microphone, files, or YouTube videos