Co-Speech Gesture Video Generation
Generate a short video from an image
Generate images from text descriptions