InfinityRP-v1-7B-GGUF-IQ-Imatrix / README.md

Lewdiculous

	---
	library_name: transformers
	license: apache-2.0
	language:
	- en
	tags:
	- gguf
	- quantized
	- roleplay
	- imatrix
	- mistral
	- merge
	- nsfw
	inference: false
	base_model:
	- ResplendentAI/Datura_7B
	- ChaoticNeutrals/Eris_Floramix_DPO_7B
	---

	> [!TIP]
	> Support: <br>
	> My upload speeds have been cooked and unstable lately. <br>
	> Realistically I'd need to move to get a better provider. <br>
	> If you want and you are able to... <br>
	> [You can support my various endeavors here (Ko-fi).](https://ko-fi.com/Lewdiculous) <br>
	> I apologize for disrupting your experience.


	This repository hosts GGUF-Imatrix quantizations for [Endevor/InfinityRP-v1-7B](https://huggingface.co/Endevor/InfinityRP-v1-7B).

	The supported --contextsize is 8192.
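As a minimal sketch of how that context size might be passed to a backend, the helper below builds a launch command. It assumes KoboldCpp is the backend (its launcher accepts `--model` and `--contextsize`); the script location and the model file name are hypothetical.

```python
MAX_CONTEXT = 8192  # supported context size stated in this README


def koboldcpp_args(model_path, context_size=MAX_CONTEXT):
    """Build an argument list for launching KoboldCpp (assumed backend)."""
    if not 0 < context_size <= MAX_CONTEXT:
        raise ValueError(f"context size must be between 1 and {MAX_CONTEXT}")
    # "koboldcpp.py" is assumed to be in the current directory.
    return ["python", "koboldcpp.py",
            "--model", model_path,
            "--contextsize", str(context_size)]


if __name__ == "__main__":
    # Hypothetical file name for one of the quants in this repository.
    print(" ".join(koboldcpp_args("InfinityRP-v1-7B-Q4_K_M-imat.gguf")))
```

The list can be handed to `subprocess.run` as-is; the guard simply refuses values above the supported 8192.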

	What does "Imatrix" mean?

	It stands for Importance Matrix, a technique used to improve the quality of quantized models.
	The Imatrix is calculated based on calibration data, and it helps determine the importance of different model activations during the quantization process.
	The idea is to preserve the most important information during quantization, which can help reduce the loss of model performance, especially when the calibration data is diverse.
	[[1]](https://github.com/ggerganov/llama.cpp/discussions/5006) [[2]](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
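To make the idea concrete, here is a toy sketch of importance-weighted quantization. It is not llama.cpp's actual algorithm: the weights, the importance values, and the simple grid search over scales are all made up for illustration. It only shows that choosing the quantization scale by minimizing an importance-weighted error protects the weights that matter most.

```python
def quantize(weights, scale, levels=7):
    """Round each weight to the nearest step of `scale`, clamped to +/- levels."""
    return [max(-levels, min(levels, round(w / scale))) * scale for w in weights]


def weighted_error(weights, approx, importance):
    """Squared error where each weight counts in proportion to its importance."""
    return sum(i * (w - a) ** 2 for w, a, i in zip(weights, approx, importance))


def best_scale(weights, importance, levels=7, candidates=200):
    """Grid-search the scale that minimizes the importance-weighted error."""
    naive = max(abs(w) for w in weights) / levels
    best_err, best = None, None
    for k in range(1, candidates + 1):
        scale = naive * (0.5 + k / candidates)  # scan around the naive scale
        err = weighted_error(weights, quantize(weights, scale, levels), importance)
        if best_err is None or err < best_err:
            best_err, best = err, scale
    return best


if __name__ == "__main__":
    weights = [0.9, -0.31, 0.12, 0.05, -0.77, 0.4]   # made-up values
    importance = [0.1, 5.0, 0.1, 0.1, 0.1, 5.0]      # made-up "imatrix" entries
    uniform = best_scale(weights, [1.0] * len(weights))
    aware = best_scale(weights, importance)
    for name, s in (("uniform", uniform), ("importance-aware", aware)):
        err = weighted_error(weights, quantize(weights, s), importance)
        print(f"{name}: scale={s:.4f} weighted error={err:.6f}")
```

In a real imatrix workflow the importance values come from running calibration text through the model, which is why diverse calibration data matters.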

	Steps:
	```
	Base⇢ GGUF(F16)⇢ Imatrix-Data(F16)⇢ GGUF(Imatrix-Quants)
	```
	Quants:
	```python
	quantization_options = [
	"Q4_K_M", "IQ4_XS", "Q5_K_M", "Q5_K_S", "Q6_K",
	"Q8_0", "IQ3_M", "IQ3_S", "IQ3_XXS"
	]
	```
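The steps and quant list above can be sketched as a small command builder. This is not the exact script used for this repository: it assumes a local llama.cpp build whose tools are named `convert-hf-to-gguf.py`, `llama-imatrix` and `llama-quantize` (the names differ between llama.cpp versions), and every file name below is hypothetical.

```python
QUANTS = [
    "Q4_K_M", "IQ4_XS", "Q5_K_M", "Q5_K_S", "Q6_K",
    "Q8_0", "IQ3_M", "IQ3_S", "IQ3_XXS",
]


def build_commands(model_dir, name, calibration_file, quants=QUANTS):
    """Return the command lines for Base -> F16 -> imatrix -> quants."""
    f16 = f"{name}-F16.gguf"          # hypothetical output name
    imatrix = f"{name}-imatrix.dat"   # hypothetical output name
    commands = [
        # 1) Convert the base model to an F16 GGUF file.
        ["python", "convert-hf-to-gguf.py", model_dir,
         "--outtype", "f16", "--outfile", f16],
        # 2) Compute the importance matrix from the calibration text.
        ["llama-imatrix", "-m", f16, "-f", calibration_file, "-o", imatrix],
    ]
    # 3) Produce one imatrix-guided quant per requested type.
    for quant in quants:
        commands.append(["llama-quantize", "--imatrix", imatrix,
                         f16, f"{name}-{quant}-imat.gguf", quant])
    return commands


if __name__ == "__main__":
    for cmd in build_commands("InfinityRP-v1-7B", "InfinityRP-v1-7B",
                              "imatrix-with-rp-format-data.txt"):
        print(" ".join(cmd))
```

Printing the commands instead of running them keeps the sketch safe to try; each list can be passed to `subprocess.run` once the tool names match your llama.cpp build.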

	If you want anything that's not here or another model, feel free to request.

	This is experimental.

	For imatrix data generation, kalomaze's `groups_merged.txt` with added roleplay chats was used. You can find it [here](https://huggingface.co/Lewdiculous/Datura_7B-GGUF-Imatrix/blob/main/imatrix-with-rp-format-data.txt).

	Original model information:

	![waifu/jpeg](https://i.imgur.com/cslLqjd.jpeg)

	This is an experimental model I currently use. It's far from great as I'm still working on it, but I leave it here for people to try if interested in this format.
	This model was basically made to stop some upsetting hallucinations, so {{char}} will mostly (and sometimes only occasionally) wait for {{user}}'s response instead of answering on {{user}}'s behalf or deciding for {{user}}. My primary idea was also to create a cozy model that thinks.

	Inspired by [lemonilia/Limamono-Mistral-7B-v0.50](https://huggingface.co/lemonilia/Limamono-Mistral-7B-v0.50)
	### Style details:
	- Quotes are used for character dialog.
	  - `"Hey, Anon... What do you think about my style?"`
	- Asterisks can be used for narration, but they are optional. The default novel format is recommended.
	  - `Her cheeks blush slightly, she tries to hide.`
	- Character thoughts are wrapped in ` marks (backticks). This may often occur spontaneously.
	  - `My heart skips a beat hearing him call me pretty!`

	If you want thoughts to appear more often, just add something like this to your system prompt: ```"{{char}} internal thoughts are wrapped with ` marks."```
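For frontends that want to style these three kinds of text differently, here is a minimal sketch that splits a reply by its markers. The function name and regular expression are illustrative assumptions, and narration written in plain novel format (without asterisks) is simply skipped.

```python
import re

# One alternative per marker: "dialogue", `thought`, *narration*.
SEGMENT = re.compile(
    r'"(?P<dialogue>[^"]+)"'
    r'|`(?P<thought>[^`]+)`'
    r'|\*(?P<narration>[^*]+)\*'
)


def split_segments(text):
    """Return (kind, content) pairs for each marked span, in order."""
    return [(m.lastgroup, m.group(m.lastgroup)) for m in SEGMENT.finditer(text)]


if __name__ == "__main__":
    reply = ('"Hey, Anon... What do you think about my style?" '
             '*Her cheeks blush slightly, she tries to hide.* '
             '`My heart skips a beat hearing him call me pretty!`')
    for kind, content in split_segments(reply):
        print(f"{kind}: {content}")
```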

	- Accepted response lengths: *tiny, short, medium, long, huge*
	  - For example: `### Response: (length = medium)`

	Note: Apparently *humongous, extreme* and *unlimited* may not work at the moment. Not fully tested.

	### Prompt format:
	Extended Alpaca, as always.

	``"You are now in roleplay chat mode. Engage in an endless chat with {{user}}. Always wait {{user}} turn, next actions and responses."``
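A minimal sketch of how a prompt with a length hint might be assembled. Only the system prompt text, the accepted lengths, and the `### Response: (length = ...)` line come from this card. The `### Instruction:` and `### Input:` headers and their layout are assumptions based on the usual Alpaca convention, so adjust them to your frontend's template. The `{{user}}` macro is left in place for the frontend to substitute.

```python
ACCEPTED_LENGTHS = ("tiny", "short", "medium", "long", "huge")

SYSTEM_PROMPT = (
    "You are now in roleplay chat mode. Engage in an endless chat with "
    "{{user}}. Always wait {{user}} turn, next actions and responses."
)


def build_prompt(user_message, length="medium", system_prompt=SYSTEM_PROMPT):
    """Assemble an Alpaca-style prompt ending in a length-tagged response header."""
    if length not in ACCEPTED_LENGTHS:
        raise ValueError(f"length must be one of {ACCEPTED_LENGTHS}")
    return (
        f"### Instruction:\n{system_prompt}\n\n"   # assumed section header
        f"### Input:\n{user_message}\n\n"          # assumed section header
        f"### Response: (length = {length})\n"     # format shown in this card
    )


if __name__ == "__main__":
    print(build_prompt("Hi! How was your day?", length="short"))
```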

	## Example:

	![example](https://files.catbox.moe/j0zxov.png)