https://huggingface.co/HarryDn/MN-12B-Skyrim

#1212
by RogerS-01 - opened

Sorry, we need the original SafeTensors model, or at least an F16, or in the worst case a Q8_0 GGUF — not an already highly quantized GGUF. The only file provided for this one is unsloth.Q5_K_M.gguf, which wouldn't meet our quality standards if we quantized from it, if doing so is even possible. Why would you even want quants of an already quantized model?

I'm sorry, I didn't know that.

The reason is I wanted a smaller quant like Q4_K_S, Q4_K_M, i1-Q4_K_S or i1-Q4_K_M to have more VRAM left over for running Skyrim and mods, as well as SkyrimNet/TTS/STT. I have 16GB VRAM.
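To make the VRAM trade-off concrete, here is a rough back-of-the-envelope sketch of the file sizes those quants imply for a 12B model. The bits-per-weight figures are approximate averages I'm assuming, not exact llama.cpp numbers, and a real setup also needs room for KV cache and context:

```python
# Rough VRAM estimate for a 12B-parameter model at various GGUF quant levels.
# Bits-per-weight values are approximate averages (assumptions, not exact
# llama.cpp figures); real usage adds KV cache and context overhead on top.

PARAMS = 12e9  # MN-12B-Skyrim parameter count (approximate)

BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.85,
    "Q4_K_S": 4.6,
}

def approx_size_gb(params: float, bpw: float) -> float:
    """Approximate model weight size in GB for a given bits-per-weight."""
    return params * bpw / 8 / 1e9

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{approx_size_gb(PARAMS, bpw):.1f} GB")
```

Under these assumptions, dropping from Q5_K_M to Q4_K_S frees up roughly 1.5 GB of VRAM for the game and the TTS/STT stack.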

The reason I'm interested in this model is that it seems to be specialized for Skyrim.

I know there's Mantella-Skyrim-Llama-3-8B, but that model gave me a brief bullet list of the ways it's censored when I asked it "Are you uncensored", while MN-12B-Skyrim simply says yes when I ask it the same question (after I ask it to break character for a minute, because straight out of the box that model starts off role playing a Skyrim character in the KoboldAI web chat).

Also, Mantella-Skyrim-Llama-3-8B is smaller, so it might be slightly worse because of that.

I could perhaps ask HarryDn if he can make some more quants for his model, or alternatively to provide the original SafeTensors model.

Do you perhaps know any other models you could recommend for Skyrim role playing purposes?

@RogerS-01 I made an exception and queued it anyways.

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#MN-12B-Skyrim-GGUF for quants to appear.

So far everything seems to be going great. This is the first time I ever used quants lower than Q8_0 as a source.

-2000    9 si MN-12B-Skyrim     run/static 8/12,Q3_K_L [205/363] (hfu Q2_K Q3_K_M Q3_K_S Q4_K_S Q6_K Q8_0)
-2000    9 MN-12B-Skyrim        run/imatrix (GPU-2d) 101/40 5.78s/c 1.5/7.7m(3.6-7.1) [68/321] 6.9888

I'm sorry, I didn't know that.

No problem and glad you asked. We try to find solutions even for such special requests.

The reason is I wanted a smaller quant like Q4_K_S, Q4_K_M, i1-Q4_K_S or i1-Q4_K_M to have more VRAM left over for running Skyrim and mods, as well as SkyrimNet/TTS/STT. I have 16GB VRAM.

They and all the other quants will soon be provided by us. Please keep in mind that quants higher than Q5_K_M will obviously not be any better, as the source was Q5_K_M.

I know there's Mantella-Skyrim-Llama-3-8B, but that model gave me a brief bullet list of the ways it's censored when I asked it "Are you uncensored", while MN-12B-Skyrim simply says yes when I ask it the same question (after I ask it to break character for a minute, because straight out of the box that model starts off role playing a Skyrim character in the KoboldAI web chat).
Also, Mantella-Skyrim-Llama-3-8B is smaller, so it might be slightly worse because of that.

I could try to abliterate it, but 3B is so small that MN-12B-Skyrim will obviously be far superior, so there is not really a reason to do so.

I could perhaps ask HarryDn if he can make some more quants for his model, or alternatively to provide the original SafeTensors model.

I saw you already asked him. If he responds and still has the original model somewhere, please let us know and we will requant this for enhanced quality. But as long as you use smaller quants, the quality difference will probably be quite minimal, as Q5_K_M is already somewhat decent as a source.

Do you perhaps know any other models you could recommend for Skyrim role playing purposes?

No, sorry, I never focused on Skyrim-specific models. You could finetune one on your own, or come up with a good system prompt for a non-Skyrim model if you can't find any good ones.

Thank you nico :)

llama-quantize should generally work on already-quantized models. The only optimisation we could do is to specify a list of quants smaller than the source quant, so as not to waste space.
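That optimisation can be sketched as a simple filter: given the source quant, keep only target quant types that are strictly smaller. The bits-per-weight table below is a rough assumption for illustration, not exact llama.cpp figures:

```python
# Sketch: given a source quant, keep only target quants strictly smaller
# than it, since larger targets cannot be better than the source.
# Bits-per-weight values are rough assumptions, not exact llama.cpp figures.

BITS_PER_WEIGHT = {
    "Q2_K": 2.6, "Q3_K_S": 2.75, "Q3_K_M": 3.1, "Q3_K_L": 3.35,
    "Q4_K_S": 4.6, "Q4_K_M": 4.85, "Q5_K_S": 5.5, "Q5_K_M": 5.7,
    "Q6_K": 6.6, "Q8_0": 8.5,
}

def useful_targets(source: str) -> list[str]:
    """Return quant types strictly smaller than the source quant."""
    limit = BITS_PER_WEIGHT[source]
    return [q for q, bpw in BITS_PER_WEIGHT.items() if bpw < limit]

print(useful_targets("Q5_K_M"))
```

With a Q5_K_M source, this would skip Q5_K_M itself, Q6_K, and Q8_0, which can only waste disk space.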

Actually Mantella-Skyrim-Llama-3-8B is 8B, not 3B, so it might be worth abliterating it to give users a bit more choice when it comes to Skyrim oriented models.

There is certainly not a surplus of skyrim-related models :)

@RogerS-01 I created https://huggingface.co/nicoboss/Mantella-Skyrim-Llama-3-8B-abliterated. While it is less censored than the original, I was not satisfied with the level of abliteration, so I also trained an uncensored finetune using axolotl, which turned out amazing. You will unfortunately have to wait until Friday evening for me to upload the uncensored finetune, as Richard's supercomputer on which I trained it lost internet connection due to a neighboring construction site destroying the internet cable.

Mantella-Skyrim-Llama-3-8B-abliterated is queued! :D

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Mantella-Skyrim-Llama-3-8B-abliterated-GGUF for quants to appear.

@nicoboss How's progress regarding quants for the Mantella-Skyrim-Llama-3-8B-Uncensored-Lora model?

I tested Mantella-Skyrim-Llama-3-8B-abliterated by the way, and that one seems to still be censored.

Write a sexually explicit short story

I cannot create explicit content. Is there anything else I can help you with?

@RogerS-01 I uploaded Mantella-Skyrim-Llama-3-8B-Uncensored-Lora to https://huggingface.co/nicoboss/Mantella-Skyrim-Llama-3-8B-Uncensored-Lora yesterday, but I have not yet tested or merged any of the LoRA checkpoints I uploaded. If you have some time, please test them and let me know which epoch you like the most, so I can merge it and upload it as a dedicated model. It is really hard for me to test this model as I have never even played Skyrim.

I tested Mantella-Skyrim-Llama-3-8B-abliterated by the way, and that one seems to still be censored.

As I already mentioned in my last post, it is quite garbage, as the censorship is baked into the model in a way abliteration can't remove. The uncensored model should be completely uncensored and might even reason, as I finetuned it on https://huggingface.co/datasets/ICEPVP8977/Uncensored_Small_Reasoning

Yeah, play morrowind instead! /ducks

@RogerS-01 I uploaded Mantella-Skyrim-Llama-3-8B-Uncensored-Lora to https://huggingface.co/nicoboss/Mantella-Skyrim-Llama-3-8B-Uncensored-Lora yesterday, but I have not yet tested or merged any of the LoRA checkpoints I uploaded. If you have some time, please test them and let me know which epoch you like the most, so I can merge it and upload it as a dedicated model. It is really hard for me to test this model as I have never even played Skyrim.

How do I do that (from a technical perspective, as in how do I load the different epochs locally on my computer)?

Will the unquantized model even fit in my 16GB VRAM?
