FINGU-AI/Chocolatine-Fusion-14B quants?
#715
by Spectre5390 · opened
I was wondering if quantizing this model is possible (or not, e.g. if it has already been quantized, or isn't in the right kind of precision):
https://huggingface.co/FINGU-AI/Chocolatine-Fusion-14B
Not directly - the model is already quantized in some way (you can see this from its Ux/Ix, i.e. unsigned/signed integer, tensor types). llama.cpp more or less only supports f16/bf16/f32 source tensors.
No doubt the model could somehow be converted back into unpacked/unquantized form, after which llama.cpp could convert and quantize it as usual. That might be as easy as a ten-line script, or it might be much harder - I don't know enough about transformers to say.
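To illustrate what "unpacking" would involve: a minimal, hypothetical sketch of affine dequantization (`scale * (q - zero_point)`), the basic scheme behind many integer quantization formats. Real quantized checkpoints (GPTQ, AWQ, etc.) use more elaborate per-group packing, so this is only a conceptual example, not a recipe for this particular model.

```python
# Hypothetical sketch: map quantized integers back to approximate
# float weights using a simple affine scheme. The scale and
# zero_point values here are made up for illustration.

def dequantize(q_values, scale, zero_point):
    """Return approximate float weights for a list of quantized ints."""
    return [scale * (q - zero_point) for q in q_values]

# Toy 4-bit-style values in the range [0, 15]
q = [0, 7, 8, 15]
weights = dequantize(q, scale=0.1, zero_point=8)
print(weights)  # approximately [-0.8, -0.1, 0.0, 0.7]
```

A real conversion script would apply this per tensor (with per-group scales and zero points), write the result out as f16/bf16 safetensors, and only then hand the model to llama.cpp's converter.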
mradermacher changed discussion status to closed
Alright. Thanks for the answer.