Unified MLLM with Text-Aligned Representations
Open Veo3-style Audio-Video Generation
THUDM/GLM-4.1V-9B-Thinking Demo
Demo for multimodal understanding and generation