Text-to-Video with LTX-Video LoRA Model (Pixel Art Style)
This document provides a step-by-step guide to generating videos from text prompts using the LTX-Video model through Hugging Face's diffusers library. The example applies LoRA weights fine-tuned for the "Pixel Art" style.
Dataset
This model is fine-tuned using the following dataset:
https://huggingface.co/datasets/svjack/test-HunyuanVideo-pixelart-videos
Installation
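First, ensure you have the necessary libraries installed. Besides the core packages, the pipeline needs transformers and sentencepiece for its T5 text encoder and tokenizer, and export_to_video needs a video backend such as imageio with imageio-ffmpeg. You can install them using pip:
pip install torch diffusers transformers sentencepiece safetensors peft imageio imageio-ffmpeg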
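Before running the example, it can help to confirm that the packages are importable. The following is a minimal sketch that uses only the standard library; the helper name `missing_modules` is hypothetical and is not part of any of the libraries above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Packages named in the install command; extend this list with any
    # other packages your setup needs.
    required = ["torch", "diffusers", "safetensors", "peft"]
    missing = missing_modules(required)
    print("Missing:", missing if missing else "none")
```

An empty result only means the modules can be located, not that their versions are compatible with each other.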
Usage
Below is a complete example of how to generate a video from a text prompt using the LTX-Video model with the "Pixel Art" style.
Step 1: Import Required Libraries
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video
Step 2: Load the Model and LoRA Weights
# Load the LTX-Video model with bfloat16 precision
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
# Load LoRA weights for the "Pixel Art" style
pipe.load_lora_weights("ltx_pixel_pytorch_lora_weights.safetensors", adapter_name="pixel")
# Set the adapter with a strength of 2.0
pipe.set_adapters("pixel", 2.0)
# Move the model to the GPU for faster inference
pipe.to("cuda")
Step 3: Define the Prompt and Generate the Video
# Define the text prompt
prompt = "In the style of Pixel, Golden light filters through the canopy, illuminating soft moss and fallen leaves. Wildflowers bloom nearby, and glowing fireflies hover in the air. A gentle stream flows in the background, its murmur blending with birdsong. The scene radiates tranquility and natural charm."
# Define the negative prompt to avoid undesirable qualities
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
# Generate the video
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
# Export the video to a file
export_to_video(video, "output.mp4", fps=24)
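The LTX-Video model card recommends widths and heights divisible by 32 and frame counts of the form 8k + 1; the values used here (704, 480, 161) satisfy both. The sketch below checks a set of values against those two rules and estimates clip length before a long run. The helper names `check_ltx_settings` and `clip_seconds` are hypothetical, and the constraints are assumptions taken from the model card rather than something this guide enforces.

```python
def check_ltx_settings(width, height, num_frames):
    """Return a list of problems; an empty list means the settings look fine."""
    problems = []
    if width % 32 != 0:
        problems.append(f"width {width} is not divisible by 32")
    if height % 32 != 0:
        problems.append(f"height {height} is not divisible by 32")
    if num_frames % 8 != 1:
        problems.append(f"num_frames {num_frames} is not of the form 8k + 1")
    return problems


def clip_seconds(num_frames, fps):
    """Approximate playback length of the exported clip in seconds."""
    return num_frames / fps


if __name__ == "__main__":
    print(check_ltx_settings(704, 480, 161))   # []
    print(round(clip_seconds(161, 24), 2))     # 6.71
```

With 161 frames exported at 24 fps, the resulting clip runs for roughly 6.7 seconds.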
Step 4: Display the Generated Video
# Display the generated video in a Jupyter notebook
from IPython import display
display.Video("output.mp4")
Example Prompts
LoRA Prefix
In the style of Pixel,
Start every prompt with this prefix. Here are three example prompts that you can use to generate different videos:
- Forest Scene:
prompt = "In the style of Pixel, Golden light filters through the canopy, illuminating soft moss and fallen leaves. Wildflowers bloom nearby, and glowing fireflies hover in the air. A gentle stream flows in the background, its murmur blending with birdsong. The scene radiates tranquility and natural charm."
- Castle Scene:
prompt = "In the style of Pixel, the video shifts to a majestic castle under a starry sky. Silvery moonlight bathes the ancient stone walls, casting soft shadows on the surrounding landscape. Towering spires rise into the night, their peaks adorned with glowing orbs that mimic the stars above. A tranquil moat reflects the shimmering heavens, its surface rippling gently in the cool breeze. Fireflies dance around the castle’s ivy-covered arches, adding a touch of magic to the scene. In the distance, a faint aurora paints the horizon with hues of green and purple, blending seamlessly with the celestial tapestry. The scene exudes an aura of timeless wonder and serene beauty."
- Urban Scene:
prompt = "In the style of Pixel, the video showcases a vibrant urban landscape. The city skyline is dominated by towering skyscrapers, their glass facades reflecting the sunlight. The streets are bustling with activity, filled with cars, buses, and pedestrians. Parks and green spaces are scattered throughout, offering a refreshing contrast to the concrete jungle. The architecture is a mix of modern and historic buildings, each telling a story of the city's evolution. The overall scene is alive with energy, capturing the essence of urban life."
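Because every prompt shares the same prefix, it can be attached programmatically when you write your own scene descriptions. This is a minimal sketch; `LORA_PREFIX` and `with_prefix` are hypothetical names used for illustration only.

```python
LORA_PREFIX = "In the style of Pixel, "


def with_prefix(scene):
    """Prepend the LoRA prefix unless the text already starts with it."""
    scene = scene.strip()
    if scene.lower().startswith(LORA_PREFIX.strip().lower()):
        return scene
    return LORA_PREFIX + scene


if __name__ == "__main__":
    prompt = with_prefix("a lighthouse stands on a cliff at dusk.")
    print(prompt)
```

The returned string can be passed as `prompt` in the generation call from Step 3.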
Conclusion
This guide demonstrates how to generate videos from text prompts using the LTX-Video model with the "Pixel Art" style. By adjusting the prompts and parameters, you can create a wide variety of pixel art video content tailored to your needs.