
Escoffier Text-to-Video Generation

This repository contains the steps and scripts needed to generate anime-style videos with the Escoffier text-to-video model, which applies LoRA (Low-Rank Adaptation) weights to the Wan 2.1 T2V 1.3B base model. The LoRA is aimed at anime-style videos of elegant female characters in fantasy settings, with vibrant colors and intricate details.

Prerequisites

Before proceeding, ensure that you have the following installed on your system:

  • Ubuntu (or a compatible Linux distribution)
  • Python 3.x
  • pip (Python package manager)
  • Git
  • Git LFS (Git Large File Storage)
  • FFmpeg
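
The command-line prerequisites can be checked before installation with a short script. This is a minimal sketch: the executable names (python3, pip, git, git-lfs, ffmpeg) are assumptions about how the tools are installed on your system and may differ, for example pip3 instead of pip.

```python
import shutil

# Executable names are assumptions; adjust them to match your system.
REQUIRED_TOOLS = ("python3", "pip", "git", "git-lfs", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

An empty result only means the executables were found; it does not check their versions.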

Installation

  1. Update and Install Dependencies

    sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
    
  2. Clone the Repository

    git clone https://huggingface.co/svjack/Escoffier_wan_2_1_1_3_B_text2video_lora
    cd Escoffier_wan_2_1_1_3_B_text2video_lora
    
  3. Install Python Dependencies

    pip install torch torchvision
    pip install -r requirements.txt
    pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
    pip install moviepy==1.0.3
    pip install sageattention==1.0.6
    
  4. Download Model Weights

    wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
    wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
    wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
    wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
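
An interrupted download can leave a missing or empty file behind, so it may be worth confirming that all four weight files are present before generating. The sketch below only checks for existence and non-zero size; it does not verify checksums. The helper name missing_weights is hypothetical and is not part of this repository.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_WEIGHTS = (
    "models_t5_umt5-xxl-enc-bf16.pth",
    "models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
    "Wan2.1_VAE.pth",
    "wan2.1_t2v_1.3B_bf16.safetensors",
)


def missing_weights(directory=".", names=EXPECTED_WEIGHTS):
    """Return the expected weight files that are absent or empty in `directory`."""
    root = Path(directory)
    missing = []
    for name in names:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_weights():
        print("Missing or empty:", name)
```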
    

Usage

To generate a video, use the wan_generate_video.py script with the appropriate parameters. Below are examples demonstrating the Escoffier aesthetic:

Stand Scene

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors \
--lora_multiplier 1.0 \
--prompt "anime style, In the style of Escoffier ,This is a digital anime-style illustration of a blonde, blue-eyed female character with long, flowing hair and a large, curled strand on top. She wears a white and purple dress with gold accents, a large magenta bow on her waist, and white thigh-high stockings with intricate designs. She has a white frilled hat with a pink ribbon. The background features glowing, crystal-like structures and a dark blue, starry sky. Her expression is gentle, and she holds up the hem of her skirt with her right hand. The overall style is vibrant and dynamic, with a focus on her detailed, fantasy-inspired outfit and the magical, ethereal setting."

Mystical Garden Scene

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors \
--lora_multiplier 1.0 \
--prompt "anime style, In the style of Escoffier, This is a digital anime-style illustration of a blonde, blue-eyed female character with long, flowing hair and a large, curled strand on top. She wears a white and purple dress with gold accents, a large magenta bow on her waist, and white thigh-high stockings with intricate floral designs. She stands gracefully in a mystical garden filled with floating crystal butterflies and glowing lilies, reaching out to touch a shimmering orb."

Interactive Mode

For experimenting with different prompts:

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors \
--lora_multiplier 1.0 \
--interactive
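
When generating several clips, the command above can be assembled from Python instead of being retyped. The following is a sketch rather than part of the repository: build_command and generate are hypothetical helper names, and the flag values are copied from the examples above. It assumes the script is run from the repository root with the weights downloaded there.

```python
import subprocess
import sys

# Path copied from the example commands above.
LORA_PATH = "Escoffier_w1_3_outputs/Escoffier_w1_3_lora-000050.safetensors"


def build_command(prompt=None, infer_steps=35, lora_multiplier=1.0, interactive=False):
    """Return the argument list for wan_generate_video.py as used in the examples."""
    cmd = [
        sys.executable, "wan_generate_video.py",
        "--fp8",
        "--task", "t2v-1.3B",
        "--video_size", "480", "832",
        "--video_length", "81",
        "--infer_steps", str(infer_steps),
        "--save_path", "save",
        "--output_type", "both",
        "--dit", "wan2.1_t2v_1.3B_bf16.safetensors",
        "--vae", "Wan2.1_VAE.pth",
        "--t5", "models_t5_umt5-xxl-enc-bf16.pth",
        "--attn_mode", "torch",
        "--lora_weight", LORA_PATH,
        "--lora_multiplier", str(lora_multiplier),
    ]
    if interactive:
        cmd.append("--interactive")
    elif prompt:
        cmd += ["--prompt", prompt]
    else:
        raise ValueError("a prompt is required unless interactive=True")
    return cmd


def generate(prompt):
    """Run one generation and raise CalledProcessError if the script fails."""
    subprocess.run(build_command(prompt), check=True)
```

Passing the arguments as a list avoids shell quoting problems with long prompts that contain commas and quotation marks.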

Key Parameters

  • --fp8: Enable FP8 precision (recommended)
  • --task: Model version (t2v-1.3B)
  • --video_size: Output resolution (e.g., 480 832)
  • --video_length: Number of frames (typically 81)
  • --infer_steps: Number of inference steps; more steps trade speed for quality (35-50)
  • --lora_weight: Path to Escoffier LoRA weights
  • --lora_multiplier: Strength of LoRA effect (1.0 recommended)
  • --prompt: Should include "In the style of Escoffier" for best results
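
The recommendations in this list can be turned into a quick pre-flight check. This is an illustrative sketch only: check_settings is a hypothetical helper, and the thresholds simply restate the suggestions above rather than limits enforced by the script.

```python
TRIGGER_PHRASE = "In the style of Escoffier"


def check_settings(prompt, infer_steps=35, lora_multiplier=1.0):
    """Return warnings for settings that fall outside the suggestions above."""
    warnings = []
    if TRIGGER_PHRASE.lower() not in prompt.lower():
        warnings.append(f'prompt does not contain "{TRIGGER_PHRASE}"')
    if not 35 <= infer_steps <= 50:
        warnings.append(f"infer_steps={infer_steps} is outside the suggested 35-50 range")
    if lora_multiplier != 1.0:
        warnings.append(f"lora_multiplier={lora_multiplier} differs from the recommended 1.0")
    return warnings


if __name__ == "__main__":
    for message in check_settings("a girl in a garden", infer_steps=20):
        print("Warning:", message)
```

The warnings are advisory; values outside these ranges may still produce usable output.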

Style Characteristics

For optimal results, prompts should describe:

  • Elegant female characters with blonde hair and blue eyes
  • Detailed fantasy outfits with bows, ribbons and embroidery
  • Magical settings like gardens, ballrooms or celestial spaces
  • Pastel color palettes with gold and purple accents
  • Graceful poses and serene expressions
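
One way to keep prompts consistent is to assemble them from a fixed style prefix plus free-form details. The sketch below assumes the prefix "anime style, In the style of Escoffier", as used in the example commands; build_prompt is a hypothetical helper, not something provided by the repository.

```python
STYLE_PREFIX = "anime style, In the style of Escoffier"


def build_prompt(*details):
    """Join the style prefix and the non-empty detail phrases with commas."""
    cleaned = [detail.strip(" ,") for detail in details]
    return ", ".join([STYLE_PREFIX] + [detail for detail in cleaned if detail])


if __name__ == "__main__":
    print(build_prompt(
        "a blonde, blue-eyed female character in a white and purple dress with gold accents",
        "standing gracefully in a celestial ballroom",
        "serene expression",
    ))
```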

Output

With --output_type both, generated output is saved in the directory given by --save_path as:

  • MP4 video file
  • Individual frames as PNG images
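
After a run, the results can be collected with a short script. This README does not describe the exact file names or subdirectory layout inside the save directory, so the sketch below makes no assumption about them and simply searches recursively by extension. list_outputs is a hypothetical helper.

```python
from pathlib import Path


def list_outputs(save_path="save"):
    """Return (videos, frames): sorted MP4 and PNG paths found under save_path."""
    root = Path(save_path)
    if not root.is_dir():
        return [], []
    videos = sorted(root.rglob("*.mp4"))
    frames = sorted(root.rglob("*.png"))
    return videos, frames


if __name__ == "__main__":
    videos, frames = list_outputs("save")
    print(f"{len(videos)} video(s), {len(frames)} frame image(s)")
```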

Troubleshooting

  • Verify all model weights are correctly downloaded
  • Ensure sufficient GPU memory (>=12GB recommended)
  • Check for version conflicts in Python packages
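
GPU memory can be checked against the 12GB recommendation by querying nvidia-smi. This sketch assumes an NVIDIA GPU with nvidia-smi on PATH and treats 12GB as 12 × 1024 MiB, which is an approximation since reported totals vary slightly by card. The helper names are hypothetical.

```python
import shutil
import subprocess

RECOMMENDED_MIB = 12 * 1024  # approximation of the 12GB recommendation above


def parse_memory_mib(smi_output):
    """Parse one total-memory value in MiB per line, skipping non-numeric lines."""
    values = []
    for line in smi_output.splitlines():
        text = line.strip()
        if text.isdigit():
            values.append(int(text))
    return values


def gpu_memory_mib():
    """Return total memory per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_memory_mib(result.stdout) if result.returncode == 0 else []


def has_enough_memory(totals_mib, minimum=RECOMMENDED_MIB):
    """Return True if at least one GPU meets the minimum."""
    return any(total >= minimum for total in totals_mib)


if __name__ == "__main__":
    totals = gpu_memory_mib()
    print("GPU memory (MiB):", totals or "no NVIDIA GPU detected")
    print("Meets recommendation:", has_enough_memory(totals))
```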

License

This project is licensed under the MIT License.

Acknowledgments

  • Hugging Face for model hosting
  • Wan-AI for base models
  • svjack for LoRA adaptation

For support, please open an issue in the repository.
