[Model request] Saily 100b, Saily 220b
The size of saily-100b-Q2_K.gguf is 49.6 GB, so I expect that all layers would fit fully on a dual RTX 3090/4090 setup or on Google Colab Pro if you quantize this model with IQ2_XXS (2.06 bpw).
Similarly, the size of saily_220b.Q2_K.gguf is 87.80 GB, so I think it would be possible to load all layers on a RunPod A100 80 GB if you quantize it with IQ2_XXS (2.06 bpw).
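As a rough sanity check of those claims, here is a minimal back-of-envelope sketch that scales the known Q2_K file sizes by the ratio of bits per weight. The 2.06 bpw figure for IQ2_XXS is taken from the request above; the ~3.0 bpw effective average for Q2_K is an assumption (Q2_K keeps some tensors at higher precision, so its real average sits above the nominal 2.56 bpw), and the estimate ignores KV cache and context overhead.

```python
# Hedged size estimate: scale the reported Q2_K file sizes by the bpw ratio.
Q2K_BPW_ASSUMED = 3.0   # assumed effective average for Q2_K, not an exact figure
IQ2_XXS_BPW = 2.06      # figure quoted in the request above

models = {
    # name: (Q2_K file size in GB, target VRAM in GB)
    "saily-100b": (49.6, 48.0),   # dual RTX 3090/4090 -> 2 x 24 GB
    "saily_220b": (87.8, 80.0),   # single A100 80 GB on RunPod
}

for name, (q2k_gb, vram_gb) in models.items():
    est_gb = q2k_gb * IQ2_XXS_BPW / Q2K_BPW_ASSUMED
    fits = "fits" if est_gb < vram_gb else "does NOT fit"
    print(f"{name}: ~{est_gb:.1f} GB at IQ2_XXS -> {fits} in {vram_gb:.0f} GB "
          f"(before KV cache / context overhead)")
```

Under these assumptions the 100B model comes out around 34 GB and the 220B model around 60 GB, which is consistent with fitting in 48 GB and 80 GB of VRAM respectively, leaving some headroom for the KV cache.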
Besides that, my personal wish is that you would also make IQ2_XXS GGUF versions of mixtral-7b×8-instruct-LimaRP-zloss and Openbuddy-mixtral-7b×8-v16.3-32k. That would be useful for users with 16 GB graphics cards. Thank you.
100B+ models go beyond the computational resources I have available. But I have now contributed everything required for preparing such models to the llama.cpp project, so hopefully someone with access to machines with more RAM/VRAM/compute can do this.