---
license: apache-2.0
base_model:
- ACE-Step/ACE-Step-v1-3.5B
pipeline_tag: text-to-audio
tags:
- gguf-node
---

## gguf quantized ace-step-v1-3.5b
- base model from [ace-step](https://huggingface.co/ACE-Step)
- the full gguf set (model + encoder + vae) works right away

### **setup (once)**
- drag **ace-step** to > `./ComfyUI/models/diffusion_models`
- drag **umt5-base** to > `./ComfyUI/models/text_encoders`
- drag **pig** to > `./ComfyUI/models/vae`
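
The setup list above can be sketched as a small lookup that tells you where each downloaded file belongs. This is only an illustration: the file names are hypothetical, and matching by substring (`ace`, `umt5`, `pig`) is an assumption, so adjust the keys to the files you actually downloaded. The sketch only prints a plan and does not move anything.

```python
from pathlib import Path
from typing import Optional

# Substring -> ComfyUI model subfolder, following the setup list above.
# "pig" is checked before "ace" so a vae file whose name also contains
# "ace" still lands in the vae folder. The substrings are assumptions.
DESTINATIONS = [
    ("umt5", "text_encoders"),
    ("pig", "vae"),
    ("ace", "diffusion_models"),
]


def destination_for(filename: str, comfy_root: str = "./ComfyUI") -> Optional[Path]:
    """Return the target path for a downloaded file, or None if nothing matches."""
    name = filename.lower()
    for key, folder in DESTINATIONS:
        if key in name:
            return Path(comfy_root) / "models" / folder / filename
    return None


# Hypothetical file names, for illustration only.
for example in ["ace-step-v1-3.5b-q4_0.gguf", "umt5-base-f16.gguf", "pig_ace_vae-f16.gguf"]:
    print(example, "->", destination_for(example))
```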

![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/ace.png)

### workflow
- drag the json or the demo audio below into your browser to load the workflow

| Prompt | Audio Sample |
|--------|---------------|
|**female singing pop music electronic beats fennec core**<br/>`cute fennec girl`<br/>`massive fennec ears`<br/>`big fluffy tail`<br/>`long blonde wavy hair`<br/>`large blue eyes`<br/>`I love fennec girl`<br/> | 🎧 **ace-step**<br><audio controls src="https://huggingface.co/calcuis/ace-gguf/resolve/main/samples%5Cace.flac"></audio> |

## review
- note: some key tensors must stay in f32 for the model to work, so the file size might not shrink much; it still generally loads faster than the safetensors checkpoint (no last-minute bottleneck problem)
- the rebuilt umt5-base tokenizer logic has been applied successfully (similar to umt5-xxl; credit goes to city96 and all other contributors who worked on solving that issue); upgrade your node to the latest version for umt5-base encoder support; the safetensors checkpoint is therefore no longer needed (removed here; if you still want it, you can get it from [comfyui-org](https://huggingface.co/Comfy-Org/ACE-Step_ComfyUI_repackaged/tree/main/all_in_one))
- get more **umt5-base** encoders [here](https://huggingface.co/chatpig/umt5-base-encoder-gguf/tree/main)
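
To see why keeping some tensors in f32 limits the size reduction, here is a rough back-of-the-envelope sketch. The numbers are illustrative assumptions, not measurements from this repo: the parameter count is taken from the "3.5B" in the model name, and both the f32 fractions and the 4.5 bits per quantized weight are made up for the example.

```python
def estimated_size_gb(n_params: float, f32_fraction: float, quant_bits: float) -> float:
    """Rough file size when a fraction of the weights stays in f32 (32 bits)
    and the remainder is stored at `quant_bits` bits per weight."""
    f32_bits = n_params * f32_fraction * 32
    quantized_bits = n_params * (1.0 - f32_fraction) * quant_bits
    return (f32_bits + quantized_bits) / 8 / 1e9


N_PARAMS = 3.5e9   # assumption: read off the model name, not the gguf file
QUANT_BITS = 4.5   # assumption: a typical 4-bit quant including overhead

for fraction in (0.0, 0.1, 0.3):
    size = estimated_size_gb(N_PARAMS, fraction, QUANT_BITS)
    print(f"{fraction:.0%} of weights in f32 -> about {size:.2f} GB")
```

Even a small share of f32 tensors dominates the total, because each one costs roughly seven times as many bits as a quantized weight under these assumptions.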

---

## bonus: fp8/16/32 scaled stable-audio-open-1.0 with gguf quantized t5_base encoder
- base model from [stabilityai](https://huggingface.co/stabilityai/stable-audio-open-1.0)
- note: this is a different model; don't mix the two up; it is also powerful and lightweight
- dry running

### **setup (once)**
- drag **t5-base** to > `./ComfyUI/models/text_encoders`
- drag **safetensors** to > `./ComfyUI/models/checkpoints`
- drag **pig** to > `./ComfyUI/models/vae`

![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/stable-audio.png)

| Prompt | Audio Sample |
|--------|---------------|
|**heaven church electronic dance music** | 🎧 **stable-audio**<br><audio controls src="https://huggingface.co/calcuis/ace-gguf/resolve/main/samples%5Csd.flac"></audio> |

## review
- note: the safetensors checkpoint in this repo is an extracted version; it contains only the model and condition switch tensors (extremely lightweight), with no clip or vae inside; use it together with a separate clip (text encoder) and vae
- optionally, get an fp8/16/32 scaled checkpoint with model and vae embedded [here](https://huggingface.co/convertor/sa1-fp8/tree/main)
- get more **t5-base** encoders [here](https://huggingface.co/chatpig/t5-base-encoder-gguf/tree/main)
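
If you want to check for yourself which tensor groups a safetensors checkpoint contains (for example, to confirm that no clip or vae tensors are inside the extracted one), the file header can be read with the standard library alone. This is a minimal sketch based on the public safetensors layout (an 8-byte little-endian header length followed by a JSON header); grouping by the first dot-separated part of each tensor name is an assumption about how the keys are organized, and `tensor_prefix_counts` is a hypothetical helper name.

```python
import json
import struct
from collections import Counter


def tensor_prefix_counts(path: str) -> Counter:
    """Count tensors per top-level name prefix in a .safetensors file.

    Only the JSON header is read; the tensor data itself is never loaded.
    """
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return Counter(name.split(".", 1)[0] for name in header)
```

Calling `tensor_prefix_counts` with the path to your checkpoint returns a counter keyed by prefix; which prefixes appear depends on the file, so compare the result against what the note above describes rather than against fixed names.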

### **reference**
- comfyui from [comfyanonymous](https://github.com/comfyanonymous/ComfyUI)
- pig architecture from [connector](https://huggingface.co/connector)
- gguf-node ([pypi](https://pypi.org/project/gguf-node)|[repo](https://github.com/calcuis/gguf)|[pack](https://github.com/calcuis/gguf/releases))