---
license: apache-2.0
base_model:
- ACE-Step/ACE-Step-v1-3.5B
pipeline_tag: text-to-audio
tags:
- gguf-node
---
## gguf quantized ace-step-v1-3.5b
- base model from [ace-step](https://huggingface.co/ACE-Step)
- full set gguf (model+encoder+vae) works right away

### **setup (once)**
- drag **ace-step** to > `./ComfyUI/models/diffusion_models`
- drag **umt5-base** to > `./ComfyUI/models/text_encoders`
- drag **pig** to > `./ComfyUI/models/vae`

![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/ace.png)

### workflow
- drag json or demo audio below to browser for workflow

| Prompt | Audio Sample |
|--------|---------------|
| **female singing pop music electronic beats fennec core**<br/>`cute fennec girl`<br/>`massive fennec ears`<br/>`big fluffy tail`<br/>`long blonde wavy hair`<br/>`large blue eyes`<br/>`I love fennec girl` | 🎧 **ace-step** |

## review
- note: some key tensors have to stay in f32 for the model to work, so the file size might not decrease that much; but it generally loads faster than the safetensors checkpoint (no last-minute bottleneck problem)
- the rebuilt umt5-base tokenizer logic has been applied successfully (similar to umt5-xxl; credit goes to city96 and all other contributors who worked on solving that issue); upgrade your node to the latest version for umt5-base encoder support; hence, the safetensors checkpoint is no longer needed (removed here; if you still want it, you can get it from [comfyui-org](https://huggingface.co/Comfy-Org/ACE-Step_ComfyUI_repackaged/tree/main/all_in_one))
- get more **umt5-base** encoder [here](https://huggingface.co/chatpig/umt5-base-encoder-gguf/tree/main)

---

## bonus: fp8/16/32 scaled stable-audio-open-1.0 with gguf quantized t5_base encoder
- base model from [stabilityai](https://huggingface.co/stabilityai/stable-audio-open-1.0)
- note: this is a different model; don't mix it up; also powerful and lightweight
- dry running

### **setup (once)**
- drag **t5-base** to > `./ComfyUI/models/text_encoders`
- drag **safetensors** to > `./ComfyUI/models/checkpoints`
- drag **pig** to > `./ComfyUI/models/vae`

![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/sd-audio.png)

| Prompt | Audio Sample |
|--------|---------------|
| **heaven church electronic dance music** | 🎧 **stable-audio** |

## review
- note: the safetensors checkpoint in this repo is an extracted version; it contains only the model and condition switch tensors (extremely lightweight); no clip and no vae inside; use it along with a separate clip (text encoder) and vae
- opt to get fp8/16/32 scaled checkpoint with model and vae embedded [here](https://huggingface.co/convertor/sa1-fp8/tree/main)
- get more **t5-base** encoder [here](https://huggingface.co/chatpig/t5-base-encoder-gguf/tree/main)

### **reference**
- comfyui from [comfyanonymous](https://github.com/comfyanonymous/ComfyUI)
- pig architecture from [connector](https://huggingface.co/connector)
- gguf-node ([pypi](https://pypi.org/project/gguf-node)|[repo](https://github.com/calcuis/gguf)|[pack](https://github.com/calcuis/gguf/releases))
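
the two **setup (once)** lists can also be done with a small script instead of dragging by hand. the following is only a minimal sketch: the target folder names come from the setup lists, but the function `place_models`, the two layout tables and the substring rules used to recognize each file are hypothetical, since the exact file names are not spelled out here. rules are checked in order and the first match wins; files that match no rule are left where they are.

```python
import shutil
from pathlib import Path

# Folder names are taken from the setup lists in this README.
# The substring rules are assumptions about how the downloaded files are named.
ACE_STEP_LAYOUT = [
    ("ace-step", "diffusion_models"),
    ("umt5-base", "text_encoders"),
    ("pig", "vae"),
]
STABLE_AUDIO_LAYOUT = [
    ("t5-base", "text_encoders"),
    ("pig", "vae"),
    (".safetensors", "checkpoints"),
]


def place_models(download_dir, comfy_root, layout):
    """Move files from download_dir into <comfy_root>/models/<folder>.

    Returns a dict mapping each moved file name to the folder it went to.
    """
    models_dir = Path(comfy_root) / "models"
    placed = {}
    # sorted() materializes the listing before any file is moved
    for path in sorted(Path(download_dir).iterdir()):
        if not path.is_file():
            continue
        name = path.name.lower()
        for needle, folder in layout:
            if needle in name:
                target_dir = models_dir / folder
                target_dir.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(target_dir / path.name))
                placed[path.name] = folder
                break  # first matching rule wins
    return placed
```

for example, `place_models("./downloads", "./ComfyUI", ACE_STEP_LAYOUT)` would handle the ace-step files, and the same call with `STABLE_AUDIO_LAYOUT` the stable-audio ones; keep the two downloads in separate folders, because a `umt5-base` file name also contains `t5-base`.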