---
license: apache-2.0
language:
- en
base_model:
- Wan-AI/Wan2.1-T2V-14B
pipeline_tag: text-to-video
tags:
- text-to-video
- text-to-image
- lora
- diffusers
- template:diffusion-lora
widget:
- text: >-
p0v_dr1v1n6, video shows a person driving a car through a burning hellscape. The driver is holding the steering wheel with both hands. Rivers of lava flow on both sides of the cracked road, and firestorms rage in the distance. The driver is looking straight ahead. The car has a digital dashboard and a touchscreen display flickering with demonic symbols.
output:
url: example_videos/pov1.mp4
- text: >-
p0v_dr1v1n6 through a sandstorm in the desert, visibility dropping as golden dust engulfs the horizon, digital dashboard displaying emergency alerts, the car struggling against the powerful winds.
output:
url: example_videos/pov2.mp4
- text: >-
dr1v12ng POV Driving. The video shows the interior of a car driving down a city street at night. The driver's hands are visible on the steering wheel. The city lights are reflecting in the windshield.
output:
url: example_videos/pov3.mp4
- text: >-
p0v_dr1v1n6, video shows a person driving a car on the surface of the Moon. The driver is holding the steering wheel with both hands. The road is covered in lunar dust, and Earth glows brightly in the sky. The driver is looking straight ahead. The car has a digital dashboard and a touchscreen display
output:
url: example_videos/pov4.mp4
---
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
<h1 style="color: #24292e; margin-top: 0;">POV Driving LoRA for Wan2.1 14B T2V</h1>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Overview</h2>
<p>This LoRA is trained on the Wan2.1 14B T2V model and allows you to generate POV driving videos in any scene or landscape you desire!</p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Features</h2>
<ul style="margin-bottom: 0;">
<li>Trained on the Wan2.1 14B T2V base model</li>
<li>Consistent results across different object and scene types</li>
<li>Simple prompt examples that are easy to adapt</li>
</ul>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Community</h2>
<ul style="margin-bottom: 0;">
<li><b>Discord:</b> <a href="https://remade.ai/join-discord?utm_source=Huggingface&utm_medium=Social&utm_campaign=model_release&utm_content=pov_driving" style="color: #0366d6; text-decoration: none;">Join our community</a> to generate videos with this LoRA for free</li>
<li><b>Request LoRAs:</b> We're training and open-sourcing Wan2.1 LoRAs for free - join our Discord to make requests!</li>
</ul>
</div>
</div>
<Gallery />
# Model File and Inference Workflow
## 📥 Download Links:
- [pov_driving_5_epochs.safetensors](./pov_driving_5_epochs.safetensors) - LoRA Model File
- [wan_txt2vid_lora_workflow.json](./workflow/wan_txt2vid_lora_workflow.json) - Wan T2V with LoRA Workflow for ComfyUI
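
For readers who prefer a script over ComfyUI, the following is a minimal, untested sketch of loading the downloaded LoRA file with the `diffusers` library and applying the recommended settings from this card (LoRA strength 1.0, guidance scale 6.0, flow shift 5.0). Several things here are assumptions rather than part of this release: the Diffusers-format base checkpoint `Wan-AI/Wan2.1-T2V-14B-Diffusers`, the local `./loras` directory, the adapter name, the 16 fps export rate, and the mapping of the card's "Embedded Guidance Scale" onto the pipeline's `guidance_scale` argument. Compatibility of this LoRA with `diffusers` has not been confirmed here.

```python
from pathlib import Path

# Values copied from the "Recommended Settings" section of this card.
SETTINGS = {"lora_strength": 1.0, "guidance_scale": 6.0, "flow_shift": 5.0}

# Assumptions (not stated in this card): the Diffusers-format base checkpoint
# and the local directory that holds the downloaded LoRA file.
BASE_MODEL_ID = "Wan-AI/Wan2.1-T2V-14B-Diffusers"
LORA_DIR = Path("./loras")
LORA_FILE = "pov_driving_5_epochs.safetensors"


def generate(prompt: str, out_path: str = "pov_driving.mp4") -> str:
    """Generate one video with the LoRA applied and return the output path."""
    # Heavy imports live inside the function so this module can be imported
    # on a machine without torch/diffusers installed.
    import torch
    from diffusers import UniPCMultistepScheduler, WanPipeline
    from diffusers.utils import export_to_video

    pipe = WanPipeline.from_pretrained(BASE_MODEL_ID, torch_dtype=torch.bfloat16)

    # Rebuild the scheduler with the recommended flow shift.
    pipe.scheduler = UniPCMultistepScheduler.from_config(
        pipe.scheduler.config, flow_shift=SETTINGS["flow_shift"]
    )

    # Load the LoRA and set its strength ("pov_driving" is an arbitrary name).
    pipe.load_lora_weights(
        str(LORA_DIR), weight_name=LORA_FILE, adapter_name="pov_driving"
    )
    pipe.set_adapters(["pov_driving"], adapter_weights=[SETTINGS["lora_strength"]])
    pipe.to("cuda")

    frames = pipe(prompt=prompt, guidance_scale=SETTINGS["guidance_scale"]).frames[0]
    export_to_video(frames, out_path, fps=16)  # fps is an assumption
    return out_path
```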
---
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Recommended Settings</h2>
<ul style="margin-bottom: 0;">
<li><b>LoRA Strength:</b> 1.0</li>
<li><b>Embedded Guidance Scale:</b> 6.0</li>
<li><b>Flow Shift:</b> 5.0</li>
</ul>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Trigger Words</h2>
<p>The key trigger phrase is: <code style="background-color: #f0f0f0; padding: 3px 6px; border-radius: 4px;">p0v_dr1v1n6</code></p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Prompt Template</h2>
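<p>For prompting, follow the structure of the example prompts: start with the trigger phrase, then describe the scene, the driver, and the car interior. This style of prompting seems to work very well.</p>
</div>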
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">ComfyUI Workflow</h2>
<p>This LoRA works with a modified version of <a href="https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo_T2V_example_02.json" style="color: #0366d6; text-decoration: none;">Kijai's Wan Video Wrapper workflow</a>. The main modification is adding a Wan LoRA node connected to the base model.</p>
<img src="./workflow/workflow_screenshot.png" style="width: 100%; border-radius: 8px; margin: 15px 0; box-shadow: 0 4px 8px rgba(0,0,0,0.1);">
<p>See the Download Links section above for the modified workflow.</p>
</div>
</div>
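
The trigger phrase and prompt structure described above can be captured in a small helper. This is only an illustrative sketch: `build_prompt` and its parameters are hypothetical names, and the fixed opening "video shows a person driving a car" is borrowed from the example prompts rather than being a requirement of the LoRA.

```python
from typing import Sequence

# Trigger phrase from the "Trigger Words" section of this card.
TRIGGER = "p0v_dr1v1n6"


def build_prompt(scene: str, details: Sequence[str] = ()) -> str:
    """Build a prompt that leads with the trigger phrase.

    `scene` completes the opening sentence (for example
    "through a sandstorm in the desert"); each entry in `details`
    becomes its own sentence after it.
    """
    scene = scene.strip().rstrip(".")
    if not scene:
        raise ValueError("scene must not be empty")

    sentences = [f"{TRIGGER}, video shows a person driving a car {scene}."]
    for detail in details:
        detail = detail.strip().rstrip(".")
        if detail:  # skip blank entries
            sentences.append(detail + ".")
    return " ".join(sentences)


example = build_prompt(
    "on the surface of the Moon",
    [
        "The driver is holding the steering wheel with both hands",
        "Earth glows brightly in the sky",
    ],
)
print(example)
```

Keeping the trigger phrase at the very start mirrors most of the example prompts; the rest of the wording can be adapted freely.
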
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Model Information</h2>
<p>The model weights are available in Safetensors format. See the Download Links section above.</p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Training Details</h2>
<ul style="margin-bottom: 0;">
<li><b>Base Model:</b> Wan2.1 14B T2V</li>
<li><b>Training Data:</b> 17 minutes of POV driving footage, split into 204 short clips, each captioned separately.</li>
<li><b>Epochs:</b> 5</li>
</ul>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Additional Information</h2>
<p>Training was done using <a href="https://github.com/tdrussell/diffusion-pipe" style="color: #0366d6; text-decoration: none;">Diffusion Pipe</a>.</p>
</div>
<div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h2 style="color: #24292e; margin-top: 0;">Acknowledgments</h2>
<p style="margin-bottom: 0;">Special thanks to Kijai for the ComfyUI Wan Video Wrapper and tdrussell for the training scripts!</p>
</div>
</div>