Script adjustment suggestion: use llama-gguf-split

#301

by patf82 - opened Sep 21, 2024

The current split files, named like "part1of3", aren't directly loadable by llama.cpp.
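For context, here is a minimal Python sketch of what using such parts involves today, assuming they are plain byte-wise slices of one GGUF file that just need to be concatenated in order. The `<name>.part<N>of<M>` pattern and the `join_parts` helper are illustrative assumptions, not something taken from the repo's scripts.

```python
import re
import shutil
from pathlib import Path

# Assumed naming pattern: "<name>.part<N>of<M>", e.g. "model.gguf.part1of3".
PART_RE = re.compile(r"^(?P<base>.+)\.part(?P<idx>\d+)of(?P<total>\d+)$")


def join_parts(any_part: Path) -> Path:
    """Concatenate all parts belonging to `any_part` into one file next to it."""
    match = PART_RE.match(any_part.name)
    if match is None:
        raise ValueError(f"not a recognised part file: {any_part.name}")

    base, total = match["base"], int(match["total"])
    parts = [any_part.with_name(f"{base}.part{i}of{total}") for i in range(1, total + 1)]

    # Fail early if a part is missing, before writing any output.
    missing = [p.name for p in parts if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing parts: {missing}")

    out_path = any_part.with_name(base)
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so multi-GB parts are never held in memory.
                shutil.copyfileobj(src, out)
    return out_path
```

Calling `join_parts(Path("model.gguf.part1of3"))` would write `model.gguf` beside the parts, which can then be loaded as a normal single-file GGUF.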

If the splits were created with the llama-gguf-split utility, using the "-00001-of-00005.gguf" naming convention and splitting on tensor boundaries, then llama.cpp could load the files directly as is.
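To make the naming convention concrete, here is a small Python sketch that builds and parses shard names of that shape. The five-digit zero padding is inferred from the "-00001-of-00005.gguf" example above; `shard_name` and `parse_shard_name` are hypothetical helpers, and the exact rules llama.cpp enforces are not verified here.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the example "-00001-of-00005.gguf" (assumption).
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})\.gguf$")


def shard_name(prefix: str, index: int, total: int) -> str:
    """Return the file name for shard `index` (1-based) out of `total`."""
    if not 1 <= index <= total:
        raise ValueError("index must be between 1 and total")
    return f"{prefix}-{index:05d}-of-{total:05d}.gguf"


def parse_shard_name(name: str) -> Optional[Tuple[str, int, int]]:
    """Return (prefix, index, total), or None if the name does not match."""
    match = SHARD_RE.match(name)
    if match is None:
        return None
    return match["prefix"], int(match["idx"]), int(match["total"])
```

A name like "model.gguf.part1of3" does not match this pattern, so `parse_shard_name` returns `None` for it.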

I know adjusting scripts is always super annoying (for the dumbest reasons), but it'd be a nice touch of extra convenience.

mradermacher

Owner Sep 21, 2024

edited Sep 21, 2024

Unfortunately, that's not possible; see the FAQ (in the model card), where this is addressed. Besides, llama.cpp could just as well load the files directly as they are: it was a deliberate choice on their side to create a new file format that is incompatible with all existing split quants on HF.

mradermacher changed discussion status to closed Sep 21, 2024