Fatal error loading t5 tokenizer when using diffusers to load "Lightricks/LTX-Video-0.9.7-distilled" and "Lightricks/LTX-Video-0.9.7-dev"

#96
by kevinmagnopus - opened

I am trying to set up a diffusers pipeline using LTX Video 0.9.7, but I'm hitting a fatal error when the T5 tokenizer is loaded. I am using an example script provided by LTX. The error is:

(vidman) % python ltx_demo.py 
Fetching 22 files: 100%|██████████| 22/22 [00:00<00:00, 57.74it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:27<00:00,  6.76s/it]
Loading checkpoint shards: 100%|██████████| 6/6 [00:00<00:00, 28.97it/s]
Loading pipeline components...:  60%|██████    | 3/5 [00:27<00:18,  9.14s/it]
Traceback (most recent call last):
  File "ltx_demo.py", line 6, in <module>
    pipe = LTXConditionPipeline.from_pretrained("Lightricks/LTX-Video-0.9.7-distilled", torch_dtype=torch.bfloat16)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "miniconda3/envs/vidman/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "miniconda3/envs/vidman/lib/python3.12/site-packages/diffusers/pipelines/pipeline_utils.py", line 961, in from_pretrained
    loaded_sub_model = load_sub_model(
                       ^^^^^^^^^^^^^^^
  File "miniconda3/envs/vidman/lib/python3.12/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 709, in load_sub_model
    raise ValueError(
ValueError: The component <class 'transformers.models.t5.tokenization_t5._LazyModule.__getattr__.<locals>.Placeholder'> of <class 'diffusers.pipelines.ltx.pipeline_ltx_condition.LTXConditionPipeline'> cannot be loaded as it does not seem to have any of the loading methods defined in {'ModelMixin': ['save_pretrained', 'from_pretrained'], 'SchedulerMixin': ['save_pretrained', 'from_pretrained'], 'DiffusionPipeline': ['save_pretrained', 'from_pretrained'], 'OnnxRuntimeModel': ['save_pretrained', 'from_pretrained'], 'PreTrainedTokenizer': ['save_pretrained', 'from_pretrained'], 'PreTrainedTokenizerFast': ['save_pretrained', 'from_pretrained'], 'PreTrainedModel': ['save_pretrained', 'from_pretrained'], 'FeatureExtractionMixin': ['save_pretrained', 'from_pretrained'], 'ProcessorMixin': ['save_pretrained', 'from_pretrained'], 'ImageProcessingMixin': ['save_pretrained', 'from_pretrained'], 'ORTModule': ['save_pretrained', 'from_pretrained']}.

The ltx_demo.py script is:

import torch
from diffusers import LTXConditionPipeline
from diffusers.pipelines.ltx.pipeline_ltx_condition import LTXVideoCondition
from diffusers.utils import export_to_video, load_video

pipe = LTXConditionPipeline.from_pretrained("Lightricks/LTX-Video-0.9.7-distilled", torch_dtype=torch.bfloat16)
pipe.to("cuda")
pipe.vae.enable_tiling()

prompt = "artistic anatomical 3d render, utlra quality, human half full male body with transparent skin revealing structure instead of organs, muscular, intricate creative patterns, monochromatic with backlighting, lightning mesh, scientific concept art, blending biology with botany, surreal and ethereal quality, unreal engine 5, ray tracing, ultra realistic, 16K UHD, rich details. camera zooms out in a rotating fashion"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
height, width = 480, 832
num_frames = 121

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_frames=num_frames,
    guidance_scale=1.0,
    num_inference_steps=10,
    decode_timestep=0.05,
    decode_noise_scale=0.025,
    image_cond_noise_scale=0.0,
    guidance_rescale=0.7,
    generator=torch.Generator().manual_seed(42),
).frames[0]
export_to_video(video, "output5.mp4", fps=24)

I am using Python 3.12.9. The conda environment is:

Package                 Version
----------------------- -----------
accelerate              1.7.0
aiofiles                24.1.0
annotated-types         0.7.0
anyio                   4.9.0
asttokens               3.0.0
certifi                 2025.4.26
charset-normalizer      3.4.2
click                   8.1.8
contourpy               1.3.2
cycler                  0.12.1
decorator               5.2.1
diffusers               0.33.1
executing               2.2.0
fastapi                 0.115.12
ffmpy                   0.5.0
filelock                3.18.0
fonttools               4.58.0
fsspec                  2025.5.0
gradio                  5.30.0
gradio_client           1.10.1
groovy                  0.1.2
h11                     0.16.0
httpcore                1.0.9
httpx                   0.28.1
huggingface-hub         0.31.4
idna                    3.10
importlib_metadata      8.7.0
ipython                 9.2.0
ipython_pygments_lexers 1.1.1
jedi                    0.19.2
Jinja2                  3.1.6
kiwisolver              1.4.8
markdown-it-py          3.0.0
MarkupSafe              3.0.2
matplotlib              3.10.3
matplotlib-inline       0.1.7
mdurl                   0.1.2
mediapy                 1.2.4
mpmath                  1.3.0
networkx                3.4.2
numpy                   2.2.6
opencv-python           4.11.0.86
orjson                  3.10.18
packaging               25.0
pandas                  2.2.3
parso                   0.8.4
pexpect                 4.9.0
pillow                  11.2.1
pip                     25.1
prompt_toolkit          3.0.51
psutil                  7.0.0
ptyprocess              0.7.0
pure_eval               0.2.3
pydantic                2.11.4
pydantic_core           2.33.2
pydub                   0.25.1
Pygments                2.19.1
pyparsing               3.2.3
python-dateutil         2.9.0.post0
python-multipart        0.0.20
pytz                    2025.2
PyYAML                  6.0.2
regex                   2024.11.6
requests                2.32.3
rich                    14.0.0
ruff                    0.11.10
safehttpx               0.1.6
safetensors             0.5.3
semantic-version        2.10.0
setuptools              78.1.1
shellingham             1.5.4
six                     1.17.0
sniffio                 1.3.1
stack-data              0.6.3
starlette               0.46.2
sympy                   1.14.0
tokenizers              0.21.1
tomlkit                 0.13.2
torch                   2.7.0
tqdm                    4.67.1
traitlets               5.14.3
transformers            4.52.1
typer                   0.15.4
typing_extensions       4.13.2
typing-inspection       0.4.0
tzdata                  2025.2
urllib3                 2.4.0
uvicorn                 0.34.2
wcwidth                 0.2.13
websockets              15.0.1
wheel                   0.45.1
zipp                    3.21.0

Any help is appreciated.
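
One possible lead, offered as an assumption rather than a confirmed diagnosis: the package list above has no sentencepiece entry, and the slow T5 tokenizer in transformers depends on that package. The Placeholder class named in the ValueError looks like what transformers substitutes for a class whose optional backend is not installed. Below is a minimal sketch for checking which packages are importable in the active environment. The helper name missing_packages is hypothetical, and the list of package names to check is a guess.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: these are the packages relevant to loading a T5 tokenizer.
    candidates = ["sentencepiece", "tokenizers", "transformers", "diffusers"]
    missing = missing_packages(candidates)
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```

If sentencepiece is reported as missing, running pip install sentencepiece in the same conda environment and then rerunning ltx_demo.py would be a reasonable next thing to try.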
