metadata
language:
- en
tags:
- video
license: apache-2.0
pipeline_tag: text-to-video
library_name: diffusers
FastMochi Model Card
Model Details
Get an 8X diffusion speedup for Mochi with FastVideo.
FastMochi is an accelerated Mochi model. It can sample high-quality videos with 8 diffusion steps, which is roughly an 8X speedup over the original Mochi with 64 steps.
- Developed by: Hao AI Lab
- License: Apache-2.0
- Distilled from: Mochi
- Github Repository: https://github.com/hao-ai-lab/FastVideo
Usage
- Clone the FastVideo repository and follow the inference instructions in the README.
- You can also run FastMochi using the official Mochi repository with the script below and these compatible weights.
Code
from genmo.mochi_preview.pipelines import (
    DecoderModelFactory,
    DitModelFactory,
    MochiMultiGPUPipeline,
    T5ModelFactory,
    linear_quadratic_schedule,
)
from genmo.lib.utils import save_video
import os

# Read prompts line by line from prompt.txt
with open("prompt.txt", "r") as f:
    prompts = [line.rstrip() for line in f]

pipeline = MochiMultiGPUPipeline(
    text_encoder_factory=T5ModelFactory(),
    world_size=4,
    dit_factory=DitModelFactory(
        model_path="weights/dit.safetensors", model_dtype="bf16"
    ),
    decoder_factory=DecoderModelFactory(
        model_path="weights/decoder.safetensors",
    ),
)

output_dir = "outputs"
os.makedirs(output_dir, exist_ok=True)

# Generate one video per prompt using 8 inference steps
for i, prompt in enumerate(prompts):
    video = pipeline(
        height=480,
        width=848,
        num_frames=163,
        num_inference_steps=8,
        sigma_schedule=linear_quadratic_schedule(8, 0.1, 6),
        cfg_schedule=[1.5] * 8,
        batch_cfg=False,
        prompt=prompt,
        negative_prompt="",
        seed=12345,
    )[0]
    save_video(video, f"{output_dir}/output_{i}.mp4")
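To make the `sigma_schedule` argument in the script above more concrete, here is a standalone sketch of how a linear-then-quadratic noise schedule can be built. It is an illustrative reimplementation under stated assumptions, not code taken from this model card: the name `sketch_linear_quadratic_schedule` is hypothetical, and the `linear_quadratic_schedule` function shipped in the Mochi repository remains the authoritative version and may differ in detail.

```python
def sketch_linear_quadratic_schedule(num_steps, threshold_noise, linear_steps=None):
    """Return num_steps + 1 sigma values that decrease from 1.0 to 0.0.

    Hypothetical illustration only; assumes linear_steps < num_steps.
    """
    if linear_steps is None:
        linear_steps = num_steps // 2
    quadratic_steps = num_steps - linear_steps

    # Linear part: levels rise evenly from 0 toward threshold_noise.
    linear_part = [i * threshold_noise / linear_steps for i in range(linear_steps)]

    # Quadratic part: a parabola that equals threshold_noise at
    # i == linear_steps and would equal 1.0 at i == num_steps.
    diff = linear_steps - threshold_noise * num_steps
    quad_coef = diff / (linear_steps * quadratic_steps**2)
    lin_coef = threshold_noise / linear_steps - 2 * diff / quadratic_steps**2
    const = quad_coef * linear_steps**2
    quadratic_part = [
        quad_coef * i**2 + lin_coef * i + const
        for i in range(linear_steps, num_steps)
    ]

    # Append the final level 1.0, then flip so the schedule runs from 1.0 down to 0.0.
    levels = linear_part + quadratic_part + [1.0]
    return [1.0 - x for x in levels]


sigmas = sketch_linear_quadratic_schedule(8, 0.1, 6)
```

With the arguments used in the script (8, 0.1, 6), this sketch produces nine values, one more than the number of inference steps. They fall slowly from 1.0 to about 0.9 over the first six steps, and the remaining drop to 0.0 happens in the last two steps.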
Training details
FastMochi is consistency distilled on the MixKit dataset with the following hyperparameters:
- Batch size: 32
- Resolution: 480x848
- Number of frames: 169
- Train steps: 128
- GPUs: 16
- LR: 1e-6
- Loss: Huber
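The Huber loss listed above is quadratic for small residuals and linear for large ones, which makes it less sensitive to outliers than a plain squared error. Below is a minimal pure-Python sketch of the standard definition. The threshold `delta=1.0` and the function name `huber_loss` are assumptions for illustration; the actual FastVideo training code may use a different threshold or a smooth (pseudo-Huber) variant.

```python
def huber_loss(prediction, target, delta=1.0):
    """Mean Huber loss over two equal-length, non-empty sequences of floats."""
    if len(prediction) != len(target) or len(prediction) == 0:
        raise ValueError("prediction and target must be non-empty and equal in length")

    total = 0.0
    for p, t in zip(prediction, target):
        residual = abs(p - t)
        if residual <= delta:
            # Small residual: quadratic penalty, like squared error.
            total += 0.5 * residual * residual
        else:
            # Large residual: linear penalty, so outliers contribute less.
            total += delta * (residual - 0.5 * delta)
    return total / len(prediction)
```

For example, `huber_loss([0.5], [0.0])` falls in the quadratic regime, while `huber_loss([2.0], [0.0])` falls in the linear regime.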
Evaluation
We provide some qualitative comparisons between FastMochi with 8-step inference and the original Mochi with 8-step inference: