Kinich Text-to-Video Generation
This repository contains the steps and scripts needed to generate anime-style videos using the Kinich text-to-video model with LoRA (Low-Rank Adaptation) weights. Given a text prompt, the model produces high-quality anime-style videos with a distinctive geometric, neon aesthetic.
Prerequisites
Before proceeding, ensure that you have the following installed on your system:
• Ubuntu (or a compatible Linux distribution)
• Python 3.x
• pip (Python package manager)
• Git
• Git LFS (Git Large File Storage)
• FFmpeg
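The prerequisites above can be checked before installing anything else. The following is a minimal sketch, not part of the repository; the executable names in `REQUIRED_TOOLS` are assumptions derived from the list (on some systems pip is installed as `pip3`, for example).

```python
import shutil

# Executable names assumed from the prerequisites list; adjust for your system.
REQUIRED_TOOLS = ["python3", "pip", "git", "git-lfs", "ffmpeg"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```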
Installation
Update and Install Dependencies
sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
Clone the Repository
git clone https://huggingface.co/svjack/Kinich_wan_2_1_1_3_B_text2video_lora
cd Kinich_wan_2_1_1_3_B_text2video_lora
Install Python Dependencies
pip install torch torchvision
pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
pip install moviepy==1.0.3
pip install sageattention==1.0.6
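Two of the packages above are pinned to exact versions. As a hedged sketch, the standard-library `importlib.metadata` module can confirm that the installed versions match those pins; the helper name `pin_mismatches` is hypothetical and not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install commands above.
PINNED = {"moviepy": "1.0.3", "sageattention": "1.0.6"}


def pin_mismatches(pins=PINNED):
    """Map each package whose installed version differs from its pin
    to the version found (None if the package is not installed)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    print(pin_mismatches() or "Pinned packages match.")
```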
Download Model Weights
wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
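Interrupted downloads can leave missing or empty files. The sketch below checks that each weight file named in the URLs above exists and is non-empty. It assumes the files were downloaded into the current directory, and it does not validate their contents.

```python
from pathlib import Path

# File names taken from the download URLs above.
WEIGHT_FILES = [
    "models_t5_umt5-xxl-enc-bf16.pth",
    "models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
    "Wan2.1_VAE.pth",
    "wan2.1_t2v_1.3B_bf16.safetensors",
]


def missing_weights(directory=".", names=WEIGHT_FILES):
    """Return the names that are absent or empty in `directory`."""
    root = Path(directory)
    missing = []
    for name in names:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_weights() or "All weight files present.")
```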
Usage
To generate a video, use the wan_generate_video.py script with the appropriate parameters. Below are examples demonstrating the Kinich aesthetic:
Futuristic City Scene
python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Kinich_w1_3_outputs/Kinich_w1_3_lora-000070.safetensors \
--lora_multiplier 1.0 \
--prompt "anime style, In the style of Kinich, This is a digital anime-style illustration featuring a young male character with teal and dark blue, tousled hair adorned with geometric, neon-colored patterns. He has large, expressive green eyes and a slight, confident smile. He is wearing a black, form-fitting outfit with gold and teal geometric designs, a matching black glove with similar patterns, and a headband with a similar design. His right hand is raised to his chin. The scene takes place outdoors in a futuristic cityscape at sunset, with glowing skyscrapers, floating platforms, and streams of light forming dynamic trails in the sky."
Action Sequence
python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Kinich_w1_3_outputs/Kinich_w1_3_lora-000070.safetensors \
--lora_multiplier 1.0 \
--prompt "anime style, In the style of Kinich, This is a digital anime-style illustration featuring a young male character with teal and dark blue, tousled hair adorned with geometric, neon-colored patterns. He has large, expressive green eyes and a slight, confident smile. He is wearing a black, form-fitting outfit with gold and teal geometric designs. The background depicts a high-energy action sequence set in a partially destroyed urban landscape. Explosions of glowing energy ripple through the air, and fragments of debris float around him as he levitates slightly, surrounded by swirling particles of light."
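The two examples above differ only in their prompt, so several scenes can be queued from one script. This is a rough sketch that reuses the flags exactly as shown above; `build_command` and `batch_commands` are hypothetical helpers, and the short prompts are illustrative placeholders. It prints the commands as a dry run instead of executing them.

```python
import shlex

# Flags copied from the examples above; only the prompt varies per run.
BASE = [
    "python", "wan_generate_video.py", "--fp8", "--task", "t2v-1.3B",
    "--video_size", "480", "832", "--video_length", "81",
    "--infer_steps", "35",
    "--save_path", "save", "--output_type", "both",
    "--dit", "wan2.1_t2v_1.3B_bf16.safetensors", "--vae", "Wan2.1_VAE.pth",
    "--t5", "models_t5_umt5-xxl-enc-bf16.pth",
    "--attn_mode", "torch",
    "--lora_weight", "Kinich_w1_3_outputs/Kinich_w1_3_lora-000070.safetensors",
    "--lora_multiplier", "1.0",
]


def build_command(prompt):
    """Return the full argument list for one prompt."""
    return BASE + ["--prompt", prompt]


def batch_commands(prompts):
    """Return one argument list per prompt."""
    return [build_command(prompt) for prompt in prompts]


if __name__ == "__main__":
    scenes = [
        "anime style, In the style of Kinich, a futuristic cityscape at sunset.",
        "anime style, In the style of Kinich, a partially destroyed urban landscape.",
    ]
    for cmd in batch_commands(scenes):
        # Dry run: print the shell-quoted command.
        # To execute instead, call subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```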
Interactive Mode
For experimenting with different prompts:
python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Kinich_w1_3_outputs/Kinich_w1_3_lora-000070.safetensors \
--lora_multiplier 1.0 \
--interactive
In interactive mode the prompt is typed at the console rather than passed with --prompt, for example:
"anime style, In the style of Kinich, This is a digital anime-style illustration featuring a young male character with teal and dark blue, tousled hair adorned with geometric, neon-colored patterns. He has large, expressive green eyes and a slight, confident smile. He is wearing a black, form-fitting outfit with gold and teal geometric designs, a matching black glove with similar patterns, and a headband with a similar design. His right hand is raised to his chin. The background is a brightly lit indoor setting with a table covered in gold coins and a signboard with colorful text. The overall color palette is vibrant, with a mix of neon and metallic hues."
Key Parameters
• --fp8: Enable FP8 precision (recommended)
• --task: Model version (t2v-1.3B)
• --video_size: Output resolution (e.g., 480 832)
• --video_length: Number of frames (typically 81)
• --infer_steps: Number of inference steps; more steps trade speed for quality (35-50)
• --lora_weight: Path to the Kinich LoRA weights
• --lora_multiplier: Strength of the LoRA effect (1.0 recommended)
• --prompt: Text prompt; include "In the style of Kinich" for best results
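To compare settings, it can help to sweep --lora_multiplier and --infer_steps and write each run to its own directory. The sketch below only generates the flag overrides. The helper name, the 0.8 multiplier, and the directory naming scheme are illustrative assumptions, not recommendations from this repository.

```python
from itertools import product


def sweep_overrides(multipliers=(0.8, 1.0), steps=(35, 50)):
    """Yield flag overrides for every (lora_multiplier, infer_steps) pair."""
    for multiplier, step in product(multipliers, steps):
        yield [
            "--lora_multiplier", str(multiplier),
            "--infer_steps", str(step),
            # Separate directory per combination so outputs are not overwritten.
            "--save_path", f"save/lora{multiplier}_steps{step}",
        ]


if __name__ == "__main__":
    for overrides in sweep_overrides():
        print(" ".join(overrides))
```

Each yielded list is meant to replace the corresponding flags in one of the commands from the Usage section.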
Style Characteristics
For optimal results, prompts should describe:
- Characters with geometric neon hair patterns
- Black outfits with gold/teal designs
- Futuristic or high-energy backgrounds
- Vibrant color palettes with glowing elements
- Dynamic poses and expressions
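The example prompts all follow the same pattern: a trigger phrase followed by descriptive sentences. A small helper can keep that pattern consistent. This is only a sketch; `build_prompt` is a hypothetical name, and the trigger text is copied from the examples in the Usage section.

```python
# Trigger phrase copied from the example prompts in the Usage section.
TRIGGER = "anime style, In the style of Kinich"


def build_prompt(*descriptions):
    """Join the trigger phrase with descriptive sentences, skipping blanks
    and making sure each sentence ends with a period."""
    sentences = []
    for text in descriptions:
        text = text.strip()
        if not text:
            continue
        if not text.endswith("."):
            text += "."
        sentences.append(text)
    if not sentences:
        return TRIGGER
    return TRIGGER + ", " + " ".join(sentences)


if __name__ == "__main__":
    print(build_prompt(
        "A young male character with teal hair adorned with geometric, neon-colored patterns",
        "He is wearing a black outfit with gold and teal geometric designs",
        "The scene takes place in a futuristic cityscape at sunset",
    ))
```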
Output
Generated videos and frames will be saved in the specified save_path directory with:
- MP4 video file
- Individual frames as PNG images
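After a run, the results can be listed with a few lines of Python. This sketch assumes the outputs use the .mp4 and .png extensions described above and searches subdirectories as well, since the exact layout inside save_path is not documented here.

```python
from pathlib import Path


def list_outputs(save_path="save"):
    """Return (videos, frames) found under `save_path`, each sorted by path."""
    root = Path(save_path)
    if not root.is_dir():
        return [], []
    videos = sorted(root.rglob("*.mp4"))
    frames = sorted(root.rglob("*.png"))
    return videos, frames


if __name__ == "__main__":
    videos, frames = list_outputs()
    print(f"{len(videos)} video(s), {len(frames)} frame image(s)")
```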
Troubleshooting
• Verify all model weights are correctly downloaded
• Ensure sufficient GPU memory (>=12GB recommended)
• Check for version conflicts in Python packages
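The GPU memory recommendation can be checked on NVIDIA hardware by querying nvidia-smi. The following is a sketch under the assumption that nvidia-smi is installed and on PATH; the helper names are hypothetical, and the 12 GB threshold is interpreted here as 12 × 1024 MiB.

```python
import shutil
import subprocess

RECOMMENDED_MIB = 12 * 1024  # ">=12GB recommended", read as MiB


def parse_memory_mib(output):
    """Parse one total-memory value (MiB) per line, skipping non-numeric lines."""
    return [int(line) for line in output.splitlines() if line.strip().isdigit()]


def gpu_memory_mib():
    """Return total memory per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_memory_mib(result.stdout) if result.returncode == 0 else []


def meets_recommendation(memory_mib, minimum=RECOMMENDED_MIB):
    """True if at least one GPU has `minimum` MiB or more."""
    return any(value >= minimum for value in memory_mib)


if __name__ == "__main__":
    memory = gpu_memory_mib()
    print("GPU memory (MiB):", memory, "OK" if meets_recommendation(memory) else "below recommendation")
```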
License
This project is licensed under the MIT License.
Acknowledgments
• Hugging Face for model hosting
• Wan-AI for base models
• svjack for LoRA adaptation
For support, please open an issue in the repository.