Stunning images using Stable Diffusion.
Generate images using prompts and masks
Voice conversion framework based on VITS