---
base_model:
- THUDM/CogVideoX1.5-5b
datasets: finetrainers/crush-smol
library_name: diffusers
license: other
license_link: https://huggingface.co/THUDM/CogVideoX1.5-5b/blob/main/LICENSE
widget:
- text: >-
    PIKA_CRUSH A red toy car is being crushed by a large hydraulic press, which
    is flattening objects as if they were under a hydraulic press.
  output:
    url: final-3000-0-2-PIKA_CRUSH-A-red-toy-car-.mp4
- text: >-
    PIKA_CRUSH A large metal cylinder is seen pressing down on a pile of
    colorful jelly beans, flattening them as if they were under a hydraulic
    press.
  output:
    url: final-3000-0-2-PIKA_CRUSH-A-large-metal-.mp4
- text: >-
    PIKA_CRUSH A large metal cylinder is seen pressing down on a pile of Oreo
    cookies, flattening them as if they were under a hydraulic press.
  output:
    url: final-3000-1-2-PIKA_CRUSH-A-large-metal-.mp4
tags:
- text-to-video
- diffusers-training
- diffusers
- cogvideox
- cogvideox-diffusers
- template:sd-lora
---

This is a LoRA fine-tune of the [THUDM/CogVideoX1.5-5b](https://huggingface.co/THUDM/CogVideoX1.5-5b) model on the [finetrainers/crush-smol](https://huggingface.co/datasets/finetrainers/crush-smol) dataset.

Code: https://github.com/a-r-r-o-w/finetrainers

> [!IMPORTANT]
> This is an experimental checkpoint and its poor generalization is well-known.

Inference code:

```py
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video
import torch

pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX1.5-5b", torch_dtype=torch.bfloat16
).to("cuda")
pipeline.load_lora_weights("finetrainers/CogVideoX-1.5-crush-smol-v0", adapter_name="cogvideox-lora")
pipeline.set_adapters("cogvideox-lora", 0.9)

prompt = """
PIKA_CRUSH A red toy car is being crushed by a large hydraulic press, which is flattening objects as if they were under a hydraulic press.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=480,
    width=768,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=25)
```

Training logs are available on WandB [here](https://wandb.ai/aryanvs/finetrainers-cogvideox?nw=nwuseraryanvs).
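
The snippet above assumes the full pipeline fits on a single GPU in bfloat16. If it does not, diffusers' standard offloading and VAE tiling helpers can lower peak memory at some cost in speed; this is an optional sketch, not part of the original recipe, and it replaces the `.to("cuda")` call:

```py
# Optional memory savers (use instead of pipeline.to("cuda")):
# model offloading keeps only the active sub-model on the GPU,
# and VAE tiling decodes the video latents in smaller tiles.
pipeline.enable_model_cpu_offload()
pipeline.vae.enable_tiling()
```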