gguf when? c'mon, it's been 11 min already!


lol well darn, i had plans today... oof... as a quantizer, i wonder if i should wait for the -Instruct? is that out yet? lol...

Better call @bartowski

@MarinaraSpaghetti

I'll put up the Bat-towski signal!

@ubergarm I was hoping to see you in one of these threads :D

+1 gguf please

wait for the instruct model, not sure how a gguf of the base model could be useful for personal usage

Base models are good for creative writing.


> lol well darn, i had plans today... oof... as a quantizer, i wonder if i should wait for the -Instruct? is that out yet? lol...

How dare you have plans when ds puts out a new model!!! 😂

"Why is the GGUF so late it's been 20 seconds already!"

I think let's wait for the instruct version. I am very patient. Very, very, very patient.

I think llama.cpp needs to be updated first.

I figured out how to create the bf16 safetensors; now I'm creating the bf16 gguf. We'll see.
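For anyone following along at home, that step looks roughly like this with llama.cpp's conversion script (a sketch; the model directory and output filename are placeholders):

```bash
# Convert the HF safetensors checkpoint to a bf16 GGUF.
# Run from a llama.cpp checkout; paths are placeholders.
python convert_hf_to_gguf.py /path/to/hf-model-dir \
    --outtype bf16 \
    --outfile model-bf16.gguf
```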

Yeah, seems like it needs some changes to llama.cpp. I got it inferring but the chat template seems messed up.

I'm throwing a Q4_K_M up soon while I work on imatrix and further quants.
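For reference, the usual pipeline is roughly this (a sketch; filenames and the calibration text are placeholders):

```bash
# Straight Q4_K_M quant from the bf16 GGUF.
./llama-quantize model-bf16.gguf model-Q4_K_M.gguf Q4_K_M

# Compute an importance matrix from a calibration text,
# then feed it into the lower-bit quants.
./llama-imatrix -m model-bf16.gguf -f calibration.txt -o imatrix.dat
./llama-quantize --imatrix imatrix.dat model-bf16.gguf model-IQ4_XS.gguf IQ4_XS
```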

@createthis it's also a base model, so chatting is not going to be as reliable without giving it a multi-turn prompt.
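Something along these lines, i.e. raw completion with a couple of hand-written turns baked into the prompt (an untested sketch; model path, prompt, and token count are placeholders):

```bash
# Base model: skip conversation mode and prime it with a
# multi-turn transcript, then let it continue the pattern.
./llama-cli -m model-Q4_K_M.gguf -no-cnv -n 256 \
    -p $'User: What is a GGUF file?\nAssistant: A binary model format used by llama.cpp.\nUser: Why quantize it?\nAssistant:'
```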

@bartowski Thanks for the llama-cli example. TIL.
