Just wanted to say thanks, Calcuis

#1
by Winnougan - opened

Works great. I followed the instructions and used an LLM to help me troubleshoot the rest. All in all it worked great! Thanks, buddy.

How did you get it working? The instructions are unclear and I'm completely lost.

I don't think this actually uses the gguf files. It looks like ggc c2 just loads the safetensors from callgg/chatterbox-decoder:

https://github.com/calcuis/gguf-connector/blob/main/src/gguf_connector/c2.py#L179
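For reference, a minimal sketch of the kind of Hub fallback that line suggests (hf_hub_download is the standard huggingface_hub API; the filename here is a hypothetical stand-in):

```python
# Hypothetical reconstruction of the behaviour described above: instead of
# reading a local gguf file, fetch a safetensors decoder from the Hub.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="callgg/chatterbox-decoder",  # repo named in this post
    filename="s3gen-bf16.safetensors",    # hypothetical filename
)
print(local_path)  # cached under ~/.cache/huggingface/hub by default
```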

Hi,

Same here: ggc c2 tries to download the safetensors version! How can I make it work with the gguf files I already downloaded?

Best

Reading the code, this smells like a full-blown scam;
you'd better run a virus scan if you downloaded anything from that repo.

Why are you making such a false claim? Those files are for other developers and the team to work on.

Hi,

Same here: ggc c2 tries to download the safetensors version! How can I make it work with the gguf files I already downloaded?

Best

Someone is working on the gguf engine; those files are for coding and testing purposes. The demo version has recently been using bf16 safetensors, which is already faster than the original f32 version.

Hi all, upgrade your gguf-connector to the latest version; all gguf files work right away.
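If anyone is unsure how to upgrade, it is the usual pip command (assuming a standard install):

```
pip install -U gguf-connector
ggc c2
```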

Hi,

Thank you for the update. I get this error:

D:\tts\chatterbox-gguf\vnv\Lib\site-packages\chichat\perth\perth_net\__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import resource_filename
GGUF file(s) available. Select which one for ve:

  1. s3gen-q8_0.gguf
  2. t3_cfg-q6_k.gguf
  3. ve_fp32-f16.gguf
  4. ve_fp32-f32.gguf
Enter your choice (1 to 4): 3
ve file: ve_fp32-f16.gguf is selected!

GGUF file(s) available. Select which one for t3:

  1. s3gen-q8_0.gguf
  2. t3_cfg-q6_k.gguf
  3. ve_fp32-f16.gguf
  4. ve_fp32-f32.gguf
Enter your choice (1 to 4): 2
t3 file: t3_cfg-q6_k.gguf is selected!
* Running on local URL: http://127.0.0.1:7860
* To create a public link, set share=True in launch().
tokenizer.json: 100%|███████████████████████████████████████████████████████████| 25.5k/25.5k [00:00<00:00, 50.6MB/s]
D:\tts\chatterbox-gguf\vnv\Lib\site-packages\huggingface_hub\file_download.py:143: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Sam\.cache\huggingface\hub\models--callgg--chatterbox-encoder. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
  warnings.warn(message)
conds.pt: 100%|██████████████████████████████████████████████████████████████████████| 107k/107k [00:00<00:00, 300kB/s]
Traceback (most recent call last):
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\queueing.py", line 625, in process_events
    response = await route_utils.call_process_api(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\blocks.py", line 2193, in process_api
    result = await self.call_function(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\blocks.py", line 1704, in call_function
    prediction = await anyio.to_thread.run_sync( # type: ignore
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\anyio\_backends\_asyncio.py", line 2470, in run_sync_in_worker_thread
    return await future
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\anyio\_backends\_asyncio.py", line 967, in run
    result = context.run(func, *args)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\utils.py", line 894, in wrapper
    response = f(*args, **kwargs)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gguf_connector\c2.py", line 359, in load_model
    model = ChatterboxTTS.from_pretrained(DEVICE)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gguf_connector\c2.py", line 283, in from_pretrained
    return cls.from_local(Path(local_path).parent, device)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gguf_connector\c2.py", line 243, in from_local
    load_file(vae_path)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\safetensors\torch.py", line 313, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
FileNotFoundError: No such file or directory: "ve_fp32-f16-bf16.safetensors"
Traceback (most recent call last):
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\queueing.py", line 625, in process_events
    response = await route_utils.call_process_api(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\blocks.py", line 2193, in process_api
    result = await self.call_function(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\blocks.py", line 1704, in call_function
    prediction = await anyio.to_thread.run_sync( # type: ignore
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\anyio\_backends\_asyncio.py", line 2470, in run_sync_in_worker_thread
    return await future
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\anyio\_backends\_asyncio.py", line 967, in run
    result = context.run(func, *args)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gradio\utils.py", line 894, in wrapper
    response = f(*args, **kwargs)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gguf_connector\c2.py", line 364, in generate
    model = ChatterboxTTS.from_pretrained(DEVICE)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gguf_connector\c2.py", line 283, in from_pretrained
    return cls.from_local(Path(local_path).parent, device)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\gguf_connector\c2.py", line 243, in from_local
    load_file(vae_path)
  File "D:\tts\chatterbox-gguf\vnv\Lib\site-packages\safetensors\torch.py", line 313, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
FileNotFoundError: No such file or directory: "ve_fp32-f16-bf16.safetensors"

Seems your venv has some compatibility problems; update your dependencies and try again.

For s3gen and ve it tries to load the safetensors version!

See here: FileNotFoundError: No such file or directory: "ve_fp32-f16-bf16.safetensors"

GGUF file(s) available. Select which one for ve:

  1. s3gen-q8_0.gguf
  2. t3_cfg-q6_k.gguf
  3. ve_fp32-f16.gguf
  4. ve_fp32-f32.gguf
Enter your choice (1 to 4): 3
ve file: ve_fp32-f16.gguf is selected!

GGUF file(s) available. Select which one for t3:

  1. s3gen-q8_0.gguf
  2. t3_cfg-q6_k.gguf
  3. ve_fp32-f16.gguf
  4. ve_fp32-f32.gguf
Enter your choice (1 to 4): 2
t3 file: t3_cfg-q6_k.gguf is selected!

Safetensors file(s) available. Select which one for s3gen:

  1. ve_fp32-f16-bf16.safetensors
Enter your choice (1 to 1):
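For what it's worth, the missing filename suggests the loader builds the safetensors name from the selected gguf stem plus a dequantization tag; a minimal sketch of that assumption (the helper name is hypothetical):

```python
from pathlib import Path

def guess_decoder_name(gguf_path: str, dtype_tag: str = "bf16") -> str:
    # Hypothetical reconstruction: appending the dequant dtype tag to the
    # gguf stem produces exactly the name in the error above.
    stem = Path(gguf_path).stem  # "ve_fp32-f16"
    return f"{stem}-{dtype_tag}.safetensors"

print(guess_decoder_name("ve_fp32-f16.gguf"))  # ve_fp32-f16-bf16.safetensors
```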

You need the s3gen safetensors; please refer to the example.

Thanks for working on this. It's a great way to save download time. But the dequantize step means it does not really save VRAM, or at least not enough to run on 4GiB. Just a heads-up.

We have a 24GiB AI system coming in two weeks, and I can run this on a Google Colab instance in the meantime. So no big deal. Keep up the great work!

Not really; with the current version you need only around 1.5GB to run the whole model, compared to the original (2.13 + 1.06 = 3.19GB), a saving of more than 50%.
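Checking the arithmetic with the quoted figures:

```python
original = 2.13 + 1.06  # GB: t3 + s3gen safetensors, as quoted above
current = 1.5           # GB: quoted footprint of the gguf version
print(f"footprint: {current / original:.0%} of original")  # ~47%
print(f"saving: {1 - current / original:.0%}")             # ~53%
```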

Hmmm. Still getting OOM. Maybe it's because the Maxwell video card is too old. The M3000M does not support CUDA malloc or float8 and has to run things in fp16.

You could clone the engine here or from ResembleAI's repo and try to run it with all-fp16 safetensors; since even the full-size (f32) model is less than 4GB, your machine can probably handle it without problems.
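A minimal sketch of loading a safetensors checkpoint in fp16, assuming safetensors is installed (the filename is a placeholder):

```python
# Load a safetensors state dict and cast it to fp16, since the card in
# question can only run half precision.
from safetensors.torch import load_file

state_dict = load_file("s3gen.safetensors")  # placeholder filename
state_dict = {k: v.half() for k, v in state_dict.items()}
size_gb = sum(v.numel() * v.element_size() for v in state_dict.values()) / 1e9
print(f"fp16 weights: {size_gb:.2f} GB")  # rough memory footprint
```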

I tried the instructions as given in the readme and ran it via localhost, but it still shows errors.

ggc c2
GGUF file(s) available. Select which one for ve:

  1. t3_cfg-bf16.gguf
  2. t3_cfg-f16.gguf
  3. t3_cfg-f32.gguf
  4. t3_cfg-q2_k.gguf
  5. t3_cfg-q3_k_m.gguf
  6. t3_cfg-q4_k_m.gguf
  7. t3_cfg-q5_k_m.gguf
  8. t3_cfg-q6_k.gguf
  9. ve_fp32-f16.gguf
  10. ve_fp32-f32.gguf
Enter your choice (1 to 10): 1
ve file: t3_cfg-bf16.gguf is selected!

GGUF file(s) available. Select which one for t3:

  1. t3_cfg-bf16.gguf
  2. t3_cfg-f16.gguf
  3. t3_cfg-f32.gguf
  4. t3_cfg-q2_k.gguf
  5. t3_cfg-q3_k_m.gguf
  6. t3_cfg-q4_k_m.gguf
  7. t3_cfg-q5_k_m.gguf
  8. t3_cfg-q6_k.gguf
  9. ve_fp32-f16.gguf
  10. ve_fp32-f32.gguf
Enter your choice (1 to 10): 9
t3 file: ve_fp32-f16.gguf is selected!

FileNotFoundError: No such file or directory: "t3_cfg-bf16-bf16.safetensors"

Why is it looking for safetensors when I already have the gguf files? Please help.

Did you really read each question carefully? The first one asks for the gguf ve; the second one asks for the gguf t3; and the last (third) one asks for the safetensors s3gen.

Your error is due to picking a t3 file instead of a ve in the first option, and a ve instead of a t3 in the second option.

Hmmm. Still getting OOM. Maybe it's because the Maxwell video card is too old. The M3000M does not support CUDA malloc or float8 and has to run things in fp16.

Upgrade your gguf-connector to the latest version; it supports non-CUDA now. You can just execute ggc c2; if no CUDA is detected, the system will use the f32 dequantization method instead of bf16.
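A minimal sketch of the described device check (assumed behaviour, using the standard torch API):

```python
import torch

# Assumed behaviour: bf16 dequantization when CUDA is present, f32 otherwise.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
DTYPE = torch.bfloat16 if DEVICE == "cuda" else torch.float32
print(f"device={DEVICE}, dequant dtype={DTYPE}")
```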

Hi @Sam2x and @thechristyjo, the full gguf set is supported right now.

It was working 5 days ago; my audio prompt was just too long.
Now, with the new version:
File "/home/k/Downloads/src/ComfyUI/.venv/lib64/python3.12/site-packages/gguf_connector/c2.py", line 220, in from_local
s3gen.load_state_dict(
File "/home/k/Downloads/src/ComfyUI/.venv/lib64/python3.12/site-packages/torch/nn/modules/module.py", line 2593, in load_state_dict
raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for S3Token2Wav:
size mismatch for mel2wav.m_source.l_linear.weight: copying a param with shape torch.Size([9]) from checkpoint, the shape in current model is torch.Size([1, 9]).
size mismatch for mel2wav.f0_predictor.classifier.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1, 512]).

You might need to use the new bf16/f16/f32 s3gen gguf; the lower-level quants for s3gen (removed earlier) will give a size-mismatch error.
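For anyone curious, the mismatch above is purely a shape issue ([N] vs [1, N]); a hypothetical workaround, not the recommended fix, would be to reshape such params before loading:

```python
import torch

def align_param_shapes(state_dict: dict, model: torch.nn.Module) -> dict:
    # Hypothetical workaround: reshape checkpoint params whose element count
    # matches the model but whose shape differs (e.g. [9] vs [1, 9]).
    target = model.state_dict()
    for key, tensor in list(state_dict.items()):
        expected = target.get(key)
        if expected is not None and tensor.shape != expected.shape \
                and tensor.numel() == expected.numel():
            state_dict[key] = tensor.reshape(expected.shape)
    return state_dict
```

The safer route is still the new full-precision s3gen gguf mentioned above.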
