Quantization request - Hermes 405B

#1129
by yano2mch - opened

If possible, I'd like to see this quantized to GGUF.
Consider this very low priority.

https://huggingface.co/michaelv182/Hermes-3-Llama-3.1-405B-Uncensored-FL2

1-bit and/or 2-bit are all I'll request, as that's all that will fit in 128 GB of RAM.

On a side note, I'm seeking good 50B-400B models.

I highly recommend you use this one instead. This is a reasoner-finetuned version of Hermes-3-Llama-3.1-405B-Uncensored, and it is in my opinion far more intelligent. It's probably the best model I've ever created together with Guilherme34, and it is uncensored as well if you use the Dirty D system prompt:

As you can see from https://huggingface.co/michaelv182/Hermes-3-Llama-3.1-405B-Uncensored-FL2/commits/main, this seems to be an unmodified clone of my https://huggingface.co/nicoboss/Hermes-3-Llama-3.1-405B-Uncensored model.

I did a search before putting in the request and didn't see it. No clue why I didn't see it then.

Edit: the 2-bit quant is 149 GB... darn, too big... :( At least for 128 GB of RAM.
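For anyone doing the same fit check, here is a rough back-of-the-envelope sketch in Python. It assumes 1 GB = 10^9 bytes and a round 405 billion parameters, and it ignores KV cache, context buffers, and whatever the OS needs, so treat the results as an optimistic bound rather than a guarantee. The function names are just illustrative.

```python
def bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter for a model file of the given size.

    Assumes 1 GB = 1e9 bytes; the 1e9 factors cancel against the
    'billion parameters' unit, leaving size_gb * 8 / params_billion.
    """
    return size_gb * 8 / params_billion


def file_size_gb(avg_bits_per_weight: float, params_billion: float) -> float:
    """Approximate file size in GB for a given average bits per weight."""
    return params_billion * avg_bits_per_weight / 8


if __name__ == "__main__":
    params = 405  # billions of parameters (assumed round figure)

    # The 149 GB "2-bit" file works out to roughly 2.94 bits per weight on average.
    print(f"149 GB file: {bits_per_weight(149, params):.2f} bits/weight")

    # To fit entirely in 128 GB, the average would need to stay under about 2.53.
    print(f"128 GB budget: {bits_per_weight(128, params):.2f} bits/weight")

    # A hypothetical quant at exactly 2.0 bits/weight would be about 101 GB.
    print(f"2.0 bits/weight: {file_size_gb(2.0, params):.2f} GB")
```

So a nominally 2-bit quant lands closer to 3 bits per weight on average, which is why it overshoots 128 GB; only something averaging noticeably under 2.5 bits per weight would leave any headroom.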

Here are some good 100B to 500B base models besides Llama 405B, in random order:

Thanks, I have been grabbing stuff as it crops up, but it's easy to miss some. I'll look them over and grab the most promising ones (preferably uncensored). Thanks a bunch!
