	---
	license: other
	license_name: nvidia-open-model-license
	license_link: https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
	language:
	- en
	base_model:
	- nvidia/Cosmos-1.0-Diffusion-7B-Video2World
	- nvidia/Cosmos-1.0-Diffusion-7B-Text2World
	pipeline_tag: video-to-video
	tags:
	- text-to-video
	- video-to-video
	- nvidia
	- gguf-node
	widget:
	- text: 'A crystalline waterfall stands partially frozen, its edges draped with translucent
	    ice that catches the sunlight in prisms of blue and silver. Below, a half-frozen
	    pool spreads out, bordered by delicate ice formations. Through the fresh snow,
	    a red fox moves gracefully, its russet coat vibrant against the white landscape,
	    leaving perfect star-shaped prints behind as steam rises from its breath in the
	    crisp winter air. The scene is wrapped in snow-muffled silence, broken only by
	    the gentle murmur of water still flowing beneath the ice. '
	  parameters:
	    negative_prompt: The video captures a series of frames showing ugly scenes, static
	      with no motion, motion blur, over-saturation, shaky footage, low resolution,
	      grainy texture, pixelated images, poorly lit areas, underexposed and overexposed
	      scenes, poor color balance, washed out colors, choppy sequences, jerky movements,
	      low frame rate, artifacting, color banding, unnatural transitions, outdated
	      special effects, fake elements, unconvincing visuals, poorly edited content,
	      jump cuts, visual noise, and flickering. Overall, the video is of poor quality.
	  output:
	    url: samples\ComfyUI_00002_.webp
	- text: anime style anime girl with massive fennec ears and one big fluffy tail, she
	    has blonde long hair blue eyes wearing a maid outfit with a long black gold leaf
	    pattern dress, walking slowly to the front with sweetie smile, holding a fancy
	    black forest cake with candles on top in the kitchen of an old dark Victorian
	    mansion lit by candlelight with a bright window to the foggy forest
	  parameters:
	    negative_prompt: The video captures a series of frames showing ugly scenes, static
	      with no motion, motion blur, over-saturation, shaky footage, low resolution,
	      grainy texture, pixelated images, poorly lit areas, underexposed and overexposed
	      scenes, poor color balance, washed out colors, choppy sequences, jerky movements,
	      low frame rate, artifacting, color banding, unnatural transitions, outdated
	      special effects, fake elements, unconvincing visuals, poorly edited content,
	      jump cuts, visual noise, and flickering. Overall, the video is of poor quality.
	  output:
	    url: samples\ComfyUI_00001_.webp
	- text: drag it to browser <metadata> same descriptor to the 1st one
	  output:
	    url: samples\ComfyUI_00003_.webp
	---

	# gguf/fp8 quantized version of video2world and text2world (test in progress)

	![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/cosmos.gif)

	## setup (once)
	- drag cosmos-7b-text2world-q4_k_m.gguf [[4.07GB](https://huggingface.co/calcuis/cosmos/blob/main/cosmos-7b-text2world-q4_k_m.gguf)] to > ./ComfyUI/models/diffusion_models
	- drag oldt5_xxl_fp8_e4m3fn.safetensors [[4.9GB](https://huggingface.co/calcuis/cosmos/blob/main/oldt5_xxl_fp8_e4m3fn.safetensors)] to > ./ComfyUI/models/text_encoders
	- drag cosmos_cv8x8x8_1.0_vae_bf16.safetensors [[211MB](https://huggingface.co/calcuis/cosmos/blob/main/cosmos_cv8x8x8_1.0_vae_bf16.safetensors)] to > ./ComfyUI/models/vae
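
If you would rather script the setup than drag files by hand, the three steps above could be sketched roughly as follows. This is a minimal sketch, not part of the original instructions: it assumes the `huggingface_hub` package is installed and that ComfyUI lives in `./ComfyUI`; the helper names `plan_downloads` and `download_all` are made up for illustration. The repo id and file names are taken from the links in the list.

```python
from pathlib import Path

REPO_ID = "calcuis/cosmos"  # repo id as it appears in the download links

# file name -> ComfyUI models subfolder, copied from the setup list
MODEL_FILES = {
    "cosmos-7b-text2world-q4_k_m.gguf": "diffusion_models",
    "oldt5_xxl_fp8_e4m3fn.safetensors": "text_encoders",
    "cosmos_cv8x8x8_1.0_vae_bf16.safetensors": "vae",
}


def plan_downloads(comfy_root="./ComfyUI"):
    """Return {file name: target directory} without touching the network."""
    root = Path(comfy_root)
    return {name: root / "models" / sub for name, sub in MODEL_FILES.items()}


def download_all(comfy_root="./ComfyUI"):
    """Fetch each file into its target directory (about 9 GB in total)."""
    # Assumption: `pip install huggingface_hub` has been run beforehand.
    from huggingface_hub import hf_hub_download

    for name, target_dir in plan_downloads(comfy_root).items():
        target_dir.mkdir(parents=True, exist_ok=True)
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=target_dir)
```

Calling `download_all()` from the folder that contains `ComfyUI` should leave the files where the list above expects them; `plan_downloads()` alone just shows where each file would go.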

	## run it straight (no installation needed)
	- run the .bat file in the main directory (assuming you are using the gguf-node [pack](https://github.com/calcuis/gguf/releases) below)
	- drag the workflow json file (below), or the sample webp file, to > your browser
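
Before launching, it can help to confirm that the model files from the setup step are actually in place. The check below is only a sketch under assumptions: it takes `./ComfyUI` as the root folder, and `missing_models` is a hypothetical helper name, not something shipped with ComfyUI or gguf-node.

```python
from pathlib import Path

# file name -> ComfyUI models subfolder, as given in the setup section
EXPECTED = {
    "cosmos-7b-text2world-q4_k_m.gguf": "diffusion_models",
    "oldt5_xxl_fp8_e4m3fn.safetensors": "text_encoders",
    "cosmos_cv8x8x8_1.0_vae_bf16.safetensors": "vae",
}


def missing_models(comfy_root="./ComfyUI"):
    """List the expected model paths that do not exist yet."""
    models = Path(comfy_root) / "models"
    paths = [models / sub / name for name, sub in EXPECTED.items()]
    return [p for p in paths if not p.is_file()]
```

An empty list from `missing_models()` means all three files were found; otherwise the returned paths show what still needs to be copied.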

	### workflow
	- example workflow for [text2world](https://huggingface.co/calcuis/cosmos/blob/main/workflow-text2world.json)
	- example workflow for [video2world](https://huggingface.co/calcuis/cosmos/blob/main/workflow-video2world.json)
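
To see which model files a downloaded workflow expects, one option is to scan the json for strings that look like model file names. The sketch below makes no assumption about the workflow schema; it simply walks whatever json it is given. The function names and the suffix list are illustrative choices, not part of ComfyUI.

```python
import json

MODEL_SUFFIXES = (".gguf", ".safetensors")


def find_model_names(node):
    """Recursively collect strings that end with a known model suffix."""
    found = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= find_model_names(value)
    elif isinstance(node, list):
        for value in node:
            found |= find_model_names(value)
    elif isinstance(node, str) and node.lower().endswith(MODEL_SUFFIXES):
        found.add(node)
    return found


def models_in_workflow(path):
    """Load a workflow json file and return the model names it mentions."""
    with open(path, encoding="utf-8") as fh:
        return sorted(find_model_names(json.load(fh)))
```

For example, `models_in_workflow("workflow-text2world.json")` would list the file names to compare against what is in `./ComfyUI/models`.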

	### review
	- roughly working, but not very stable or consistent for the time being
	- gguf with pig architecture works right away; you are welcome to test it

	### reference
	- base model from [nvidia](https://huggingface.co/nvidia) (text2world:[7b](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World)\|[14b](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Text2World) & video2world:[7b](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Video2World)\|[14b](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Video2World))
	- pig architecture from [connector](https://huggingface.co/connector)
	- comfyui from [comfyanonymous](https://github.com/comfyanonymous/ComfyUI)
	- gguf-node ([pypi](https://pypi.org/project/gguf-node)\|[repo](https://github.com/calcuis/gguf)\|[pack](https://github.com/calcuis/gguf/releases))

	<Gallery />