
Xiang_Handsome Text-to-Video Generation

This repository contains the necessary steps and scripts to generate videos using the Xiang_Handsome text-to-video model. The model leverages LoRA (Low-Rank Adaptation) weights and pre-trained components to create high-quality anime-style videos based on textual prompts.

Prerequisites

Before proceeding, ensure that you have the following installed on your system:

• Ubuntu (or a compatible Linux distribution)
• Python 3.x
• pip (Python package manager)
• Git
• Git LFS (Git Large File Storage)
• FFmpeg

Installation

  1. Update and Install Dependencies

    sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
    
  2. Clone the Repository

    git clone https://huggingface.co/svjack/Xiang_Handsome_wan_2_1_14_B_text2video_lora  
    cd Xiang_Handsome_wan_2_1_14_B_text2video_lora  
    
  3. Install Python Dependencies

    pip install torch torchvision
    pip install -r requirements.txt
    pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
    pip install moviepy==1.0.3
    pip install sageattention==1.0.6
    
  4. Download Model Weights

    wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
    wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
    wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
    wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
    wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_14B_bf16.safetensors
    

Usage

To generate a video, use the wan_generate_video.py script with the appropriate parameters. Below are examples of how to generate videos using the Xiang_Handsome model.

Ice Cream (Wan 2.1)

  • Use the Wan FusionX 14B checkpoint.
In the style of anime landscape, a young man wearing glasses stands fully unclothed in front of the camera, eating an ice cream.
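As a minimal sketch, a prompt like the one above can be passed to wan_generate_video.py. The weight filenames match the downloads in step 4, but the LoRA filename, resolution, and step count below are placeholders to adapt to your checkout; writing the command to a helper script lets you review it before running:

```shell
# Hypothetical invocation: the LoRA path and sampling settings are
# placeholders, while the .pth/.safetensors names match step 4 above.
cat > run_wan21_ice_cream.sh <<'EOF'
python wan_generate_video.py --fp8 --task t2v-14B \
  --video_size 480 832 --video_length 81 --infer_steps 20 \
  --save_path save --output_type both \
  --dit wan2.1_t2v_14B_bf16.safetensors \
  --vae Wan2.1_VAE.pth \
  --t5 models_t5_umt5-xxl-enc-bf16.pth \
  --attn_mode torch \
  --lora_weight Xiang_Handsome_lora.safetensors \
  --lora_multiplier 1.0 \
  --prompt "In the style of anime landscape, a young man wearing glasses eating an ice cream."
EOF
```

Run it with `bash run_wan21_ice_cream.sh` once the repository and the downloaded weights are in place.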

Ice Cream (Wan 2.2)

Summer refreshment: A clean-cut young man wearing glasses, dressed in a simple white T-shirt and khaki shorts, stands in dappled sunlight under the shade of a tree, smiling brightly as he enjoys a vanilla ice cream cone studded with chocolate chips. Warm tones, candid lifestyle shot.

Character:
[Age] Young adult (roughly 18-25) [Gender] Male [Clothing] White T-shirt and khaki shorts [Ethnicity/Region] East Asian young man (about 180 cm tall)

Appearance:
• Build: about 180 cm tall, lean and clean-cut
• Clothing details: a simple white T-shirt and khaki shorts
• Accessories: glasses
• Other: hairstyle unspecified; assume ordinary short or medium-length hair

State:
• Environment: under dappled sunlight in the shade of a tree
• Action: standing and enjoying an ice cream

Expression:
• Mood: pleased and happy
• Detail: a bright smile

Shot design:

• Framing: medium shot or full-body shot
• Angle: frontal or side view
• Focus: the subject and the ice cream
• Camera movement: static shot or slight movement
• Metaphor: warm tones, lifestyle feel

Body language:
• Hands: one hand holding the ice cream, the other hanging naturally or resting in a pocket
• Torso: standing, possibly leaning slightly forward to taste the ice cream
• Overall mood: relaxed, cheerful summer atmosphere

Notes:
• Ambient sound can be added (e.g., birdsong, wind)
• Suggested special effects (e.g., a close-up highlighting the ice cream)

With the PUSA LoRA
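A hedged sketch of combining this model's LoRA with a PUSA LoRA, assuming the script accepts repeated values for --lora_weight and --lora_multiplier (as related musubi-tuner generation scripts do); both LoRA filenames below are placeholders:

```shell
# Sketch only: stacking two LoRAs. The LoRA filenames are placeholders,
# and multi-value --lora_weight/--lora_multiplier is an assumption.
cat > run_with_pusa.sh <<'EOF'
python wan_generate_video.py --fp8 --task t2v-14B \
  --video_size 480 832 --video_length 81 --infer_steps 20 \
  --save_path save --output_type both \
  --dit wan2.1_t2v_14B_bf16.safetensors \
  --vae Wan2.1_VAE.pth \
  --t5 models_t5_umt5-xxl-enc-bf16.pth \
  --attn_mode torch \
  --lora_weight Xiang_Handsome_lora.safetensors pusa_lora.safetensors \
  --lora_multiplier 1.0 0.8 \
  --prompt "A clean-cut young man wearing glasses enjoys a chocolate-chip vanilla ice cream in dappled tree shade."
EOF
```

The second multiplier (0.8 here) controls how strongly the PUSA LoRA is applied relative to the character LoRA.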

Parameters

  • --fp8: Enable FP8 precision (optional).
  • --task: Specify the task (e.g., t2v-1.3B).
  • --video_size: Set the resolution of the generated video (e.g., 1024 1024).
  • --video_length: Define the length of the video in frames.
  • --infer_steps: Number of inference steps.
  • --save_path: Directory to save the generated video.
  • --output_type: Output type (e.g., both for video and frames).
  • --dit: Path to the diffusion model weights.
  • --vae: Path to the VAE model weights.
  • --t5: Path to the T5 model weights.
  • --attn_mode: Attention mode (e.g., torch).
  • --lora_weight: Path to the LoRA weights.
  • --lora_multiplier: Multiplier for LoRA weights.
  • --prompt: Textual prompt for video generation.
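Putting these flags together, a minimal run against the 1.3B checkpoint from step 4 might look like the following; the LoRA filename and the sampling settings are assumptions, not values taken from this repository:

```shell
# Minimal sketch using the 1.3B checkpoint downloaded earlier;
# the LoRA filename, resolution, and step count are placeholders.
cat > run_t2v_1_3b.sh <<'EOF'
python wan_generate_video.py --task t2v-1.3B \
  --video_size 480 832 --video_length 81 --infer_steps 20 \
  --save_path save --output_type both \
  --dit wan2.1_t2v_1.3B_bf16.safetensors \
  --vae Wan2.1_VAE.pth \
  --t5 models_t5_umt5-xxl-enc-bf16.pth \
  --attn_mode torch \
  --lora_weight Xiang_Handsome_lora.safetensors \
  --lora_multiplier 1.0 \
  --prompt "In the style of anime landscape, a young man wearing glasses eating an ice cream."
EOF
```

The 1.3B checkpoint is far lighter on VRAM than the 14B one, so it is a reasonable first test that the pipeline is wired up correctly.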

Output

The generated video and its frames are saved to the directory given by --save_path.

Troubleshooting

• Ensure all dependencies are correctly installed.
• Verify that the model weights are downloaded and placed in the correct locations.
• Check for any missing Python packages and install them using pip.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

• Hugging Face for hosting the model weights.
• Wan-AI for providing the pre-trained models.
• DeepBeepMeep for contributing to the model weights.

Contact

For any questions or issues, please open an issue on the repository or contact the maintainer.

