Escoffier Text-to-Video Generation

This repository contains the necessary steps and scripts to generate anime-style videos using the Escoffier text-to-video model with LoRA (Low-Rank Adaptation) weights. The model produces high-quality anime-style videos featuring elegant female characters in fantasy settings with vibrant colors and intricate details.

Prerequisites

Before proceeding, ensure that you have the following installed on your system:

• Ubuntu (or a compatible Linux distribution)
• Python 3.x
• pip (Python package manager)
• Git
• Git LFS (Git Large File Storage)
• FFmpeg

Installation

Update and Install Dependencies

sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg

Clone the Repository

git clone https://huggingface.co/svjack/Escoffier_wan_2_1_1_3_B_text2video_lora
cd Escoffier_wan_2_1_1_3_B_text2video_lora

Install Python Dependencies

pip install torch torchvision
pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
pip install moviepy==1.0.3
pip install sageattention==1.0.6

Download Model Weights

wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
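
Before running generation, it can help to confirm that the downloads landed in the working directory. The following is a minimal sketch, assuming the four files are saved under their original names in a single directory. The helper name missing_weights is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_WEIGHTS = [
    "models_t5_umt5-xxl-enc-bf16.pth",
    "models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
    "Wan2.1_VAE.pth",
    "wan2.1_t2v_1.3B_bf16.safetensors",
]


def missing_weights(directory="."):
    """Return the expected weight files that are absent or empty in `directory`."""
    root = Path(directory)
    missing = []
    for name in EXPECTED_WEIGHTS:
        path = root / name
        # A zero-byte file usually means the download was interrupted.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_weights()
    if absent:
        print("Missing or empty weight files:")
        for name in absent:
            print("  " + name)
    else:
        print("All expected weight files are present.")
```

This only checks that each file exists and is non-empty. It does not verify checksums, so a truncated download could still pass.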

Usage

To generate a video, use the wan_generate_video.py script with the appropriate parameters. Below are examples demonstrating the Escoffier aesthetic:

Stand Scene

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors \
--lora_multiplier 1.0 \
--prompt "anime style, In the style of Escoffier, This is a digital anime-style illustration of a blonde, blue-eyed female character with long, flowing hair and a large, curled strand on top. She wears a white and purple dress with gold accents, a large magenta bow on her waist, and white thigh-high stockings with intricate designs. She has a white frilled hat with a pink ribbon. The background features glowing, crystal-like structures and a dark blue, starry sky. Her expression is gentle, and she holds up the hem of her skirt with her right hand. The overall style is vibrant and dynamic, with a focus on her detailed, fantasy-inspired outfit and the magical, ethereal setting."

Mystical Garden Scene

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors \
--lora_multiplier 1.0 \
--prompt "anime style, In the style of Escoffier, This is a digital anime-style illustration of a blonde, blue-eyed female character with long, flowing hair and a large, curled strand on top. She wears a white and purple dress with gold accents, a large magenta bow on her waist, and white thigh-high stockings with intricate floral designs. She stands gracefully in a mystical garden filled with floating crystal butterflies and glowing lilies, reaching out to touch a shimmering orb."

Interactive Mode

For experimenting with different prompts:

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors \
--lora_multiplier 1.0 \
--interactive

Genshin TCG (原神幻影卡牌)

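Genshin_TCG ，原神集换式卡牌游戏，动漫风格，一位身穿白色和紫色连衣裙的金发女孩戴着厨师帽，在厨房里，用锋利的刀切蔬菜，切成完美的方块。

English translation: Genshin_TCG, Genshin Impact trading card game, anime style, a blonde girl in a white and purple dress and a chef's hat is in a kitchen, chopping vegetables into perfect cubes with a sharp knife.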

Sequential Steps

[1] anime style, In the style of Escoffier: A blonde girl in a white and purple dress chops vegetables with sharp knives, making perfect cubes.

[2] anime style, In the style of Escoffier: The same girl sears meat in a hot pan, the sizzle mixing with bubbling broth.

[3] anime style, In the style of Escoffier: She pours wine into the pan, creating a blue flame flare-up.

[4] anime style, In the style of Escoffier: She carefully plates the dish with tweezers like an artist.
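
The four prompts above can be rendered one after another with a small driver script. This is a sketch only: it reuses the flags from the example commands in the Usage section, and the names BASE_ARGS, STEP_PROMPTS, build_command, and run_all are hypothetical helpers rather than part of the repository. By default it prints the commands without running them.

```python
import shlex
import subprocess

# Flags copied from the example commands in the Usage section.
BASE_ARGS = [
    "python", "wan_generate_video.py", "--fp8", "--task", "t2v-1.3B",
    "--video_size", "480", "832", "--video_length", "81",
    "--infer_steps", "35",
    "--save_path", "save", "--output_type", "both",
    "--dit", "wan2.1_t2v_1.3B_bf16.safetensors", "--vae", "Wan2.1_VAE.pth",
    "--t5", "models_t5_umt5-xxl-enc-bf16.pth",
    "--attn_mode", "torch",
    "--lora_weight",
    "Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors",
    "--lora_multiplier", "1.0",
]

# The four sequential prompts listed above.
STEP_PROMPTS = [
    "anime style, In the style of Escoffier: A blonde girl in a white and "
    "purple dress chops vegetables with sharp knives, making perfect cubes.",
    "anime style, In the style of Escoffier: The same girl sears meat in a "
    "hot pan, the sizzle mixing with bubbling broth.",
    "anime style, In the style of Escoffier: She pours wine into the pan, "
    "creating a blue flame flare-up.",
    "anime style, In the style of Escoffier: She carefully plates the dish "
    "with tweezers like an artist.",
]


def build_command(prompt):
    """Return the full argument list for one generation run."""
    return BASE_ARGS + ["--prompt", prompt]


def run_all(prompts, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for index, prompt in enumerate(prompts, start=1):
        command = build_command(prompt)
        print(f"[{index}] {shlex.join(command)}")
        if not dry_run:
            # check=True stops the sequence if one generation fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(STEP_PROMPTS, dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with long prompts. Call run_all(STEP_PROMPTS, dry_run=False) to actually generate the clips.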

Key Parameters

--fp8: Enable FP8 precision (recommended)
--task: Model version (t2v-1.3B)
--video_size: Output resolution (e.g., 480 832)
--video_length: Number of frames (typically 81)
--infer_steps: Number of inference steps; more steps generally improve quality but take longer (35-50)
--lora_weight: Path to Escoffier LoRA weights
--lora_multiplier: Strength of LoRA effect (1.0 recommended)
--prompt: Should include "In the style of Escoffier" for best results
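
To relate --video_length to clip duration, divide the frame count by the playback frame rate. The sketch below assumes 16 frames per second, which is the commonly cited default for Wan 2.1 output but is not stated in this README. It also checks the 4n + 1 frame-count pattern that Wan 2.1 is generally documented to expect, which is likewise an assumption here. Both function names are hypothetical.

```python
def clip_duration_seconds(num_frames, fps=16.0):
    """Return the clip duration in seconds for a frame count and frame rate.

    The default of 16 fps is an assumption, not a value from this README.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def is_4n_plus_1(num_frames):
    """Return True if num_frames has the form 4n + 1 (assumed requirement)."""
    return num_frames >= 1 and (num_frames - 1) % 4 == 0


if __name__ == "__main__":
    frames = 81  # the value used in the examples above
    print("duration in seconds:", clip_duration_seconds(frames))
    print("matches 4n + 1:", is_4n_plus_1(frames))
```

Under these assumptions, the 81 frames used in the examples correspond to a clip of roughly five seconds.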

Style Characteristics

For optimal results, prompts should describe:

Elegant female characters with blonde hair and blue eyes
Detailed fantasy outfits with bows, ribbons and embroidery
Magical settings like gardens, ballrooms or celestial spaces
Pastel color palettes with gold and purple accents
Graceful poses and serene expressions
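
One way to keep prompts consistent with these characteristics is to assemble them from named parts, always leading with the trigger phrase. This is an illustrative sketch: the function build_prompt and its parameter names are hypothetical, and the ordering of parts is a guess based on the example prompts above.

```python
TRIGGER = "In the style of Escoffier"


def build_prompt(character, outfit, setting, action, prefix="anime style"):
    """Join the prefix, trigger phrase, and descriptive parts into one prompt.

    Empty or whitespace-only parts are skipped.
    """
    parts = [prefix, TRIGGER, character, outfit, setting, action]
    return ", ".join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    print(build_prompt(
        character="a blonde, blue-eyed female character with long, flowing hair",
        outfit="a white and purple dress with gold accents and a magenta bow",
        setting="a mystical garden with glowing lilies",
        action="she reaches out to touch a shimmering orb",
    ))
```

The resulting string can be passed to --prompt unchanged.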

Output

Generated videos and frames are saved in the directory given by --save_path (save in the examples above). The output includes:

MP4 video file
Individual frames as PNG images
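
After a run, the output directory can be inspected with a few lines of Python. This sketch assumes only that videos end in .mp4 and frames end in .png somewhere under the save directory. The exact file naming and subdirectory layout used by the script are not specified here, and list_outputs is a hypothetical helper.

```python
from pathlib import Path


def list_outputs(save_path="save"):
    """Return sorted lists of .mp4 and .png files found under save_path."""
    root = Path(save_path)
    if not root.is_dir():
        return {"videos": [], "frames": []}
    # rglob searches subdirectories too, since the layout is not assumed.
    return {
        "videos": sorted(root.rglob("*.mp4")),
        "frames": sorted(root.rglob("*.png")),
    }


if __name__ == "__main__":
    found = list_outputs("save")
    print(len(found["videos"]), "video file(s)")
    print(len(found["frames"]), "frame image(s)")
    for video in found["videos"]:
        print("  " + str(video))
```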

Troubleshooting

• Verify all model weights are correctly downloaded
• Ensure sufficient GPU memory (>=12GB recommended)
• Check for version conflicts in Python packages
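
The GPU memory recommendation can be checked from Python before starting a long run. This is a minimal sketch, assuming PyTorch is installed as described in the installation steps. It interprets 12GB as 12 x 10^9 bytes, which is an assumption, and the helper names are hypothetical.

```python
def meets_memory_recommendation(total_bytes, required_gb=12.0):
    """Return True if total_bytes is at least required_gb decimal gigabytes."""
    return total_bytes >= required_gb * 10**9


def gpu_total_memory_bytes(device_index=0):
    """Return total memory of a CUDA device in bytes, or None if unavailable."""
    try:
        import torch  # imported lazily so the check degrades gracefully
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device_index).total_memory


if __name__ == "__main__":
    total = gpu_total_memory_bytes()
    if total is None:
        print("No CUDA device detected, or torch is not installed.")
    elif meets_memory_recommendation(total):
        print("GPU memory meets the 12GB recommendation.")
    else:
        print("GPU memory is below the 12GB recommendation.")
```

This reports total device memory, not free memory, so other processes using the GPU can still cause out-of-memory errors.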

License

This project is licensed under the MIT License.

Acknowledgments

• Hugging Face for model hosting
• Wan-AI for base models
• svjack for LoRA adaptation

For support, please open an issue in the repository.