[merged experimental] Potential Linux support.

#32
by RakshitAralimatti - opened

OSError: [Errno 8] Exec format error: './bin/quantize.exe'

This happens when running the script on Linux, where the kernel cannot execute a Windows .exe binary.

Simple Python script (gguf-imat.py - I recommend using the specific "for-FP16" or "for-BF16" scripts) to generate various GGUF-IQ-Imatrix quantizations from a Hugging Face author/model input (...)
(...) for Windows and NVIDIA hardware.

This script currently only supports Windows 10/11; the .exe is the Windows executable (binary), which is fetched from the latest llama.cpp release via the llama-b3145-bin-win-cuda-cu12.2.0-x64.zip file.

Since llama.cpp now provides Linux binaries (named llama-b3145-bin-ubuntu-x64.zip) in its recent releases, this could be adapted/ported by someone who uses Linux and is interested in doing so.
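For illustration, the adaptation could pick the release asset based on the platform. A minimal sketch, assuming llama.cpp's standard GitHub releases URL layout for the b3145 tag (the asset names are the ones mentioned above; the RELEASE_TAG constant and release_asset_url helper are hypothetical):

```python
import platform

# Hypothetical sketch: pick the llama.cpp release asset by platform.
# Asset names are the ones named in this thread; the URL pattern assumes
# llama.cpp's standard GitHub releases layout for tag b3145.
RELEASE_TAG = "b3145"
ASSETS = {
    "Windows": f"llama-{RELEASE_TAG}-bin-win-cuda-cu12.2.0-x64.zip",
    "Linux": f"llama-{RELEASE_TAG}-bin-ubuntu-x64.zip",
}

def release_asset_url() -> str:
    system = platform.system()  # "Windows", "Linux", "Darwin", ...
    if system not in ASSETS:
        raise OSError(f"No prebuilt llama.cpp binaries configured for {system}")
    return (
        "https://github.com/ggerganov/llama.cpp/releases/download/"
        f"{RELEASE_TAG}/{ASSETS[system]}"
    )
```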

I'm not sure how/if GPU/NVIDIA CUDA would work out of the box with this; that's what I use for the imatrix generation.

@Virt-io @Lewdiculous

FantasiaFoundry changed discussion title from quantize.exe for linux to Potential Linux support.

@FantasiaFoundry

I have added support for Linux; however, you will need to compile locally. Should I add a duplicate .py or merge the changes into the existing file?

I didn't do much, just added a check with platform.system() and some if and elif statements with updated commands for Linux.
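In broad strokes, that check probably looks something like the sketch below; the binary paths and the run_quantize helper are assumptions based on the error message above, not the actual commit:

```python
import platform
import subprocess

# Minimal sketch of the described change: branch on platform.system()
# and run the matching binary. The paths here are assumptions based on
# the './bin/quantize.exe' error message; a locally compiled Linux build
# would provide an extension-less './bin/quantize'.
def run_quantize(args: list[str]) -> None:
    system = platform.system()
    if system == "Windows":
        binary = "./bin/quantize.exe"  # prebuilt Windows binary
    elif system == "Linux":
        binary = "./bin/quantize"      # compiled locally on Linux
    else:
        raise OSError(f"Unsupported platform: {system}")
    subprocess.run([binary, *args], check=True)
```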


Just made it a separate file, since I don't know if I broke anything on Windows.

@Virt-io - Thanks for the commit!

FantasiaFoundry changed discussion status to closed
FantasiaFoundry changed discussion title from Potential Linux support. to [merged experimental] Potential Linux support.
