benk04/NoromaidxOpenGPT4-2-3.75bpw-h6-exl2

	---
	base_model:
	- NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3
	- rombodawg/Open_Gpt4_8x7B_v0.2
	- mistralai/Mixtral-8x7B-Instruct-v0.1
	tags:
	- mergekit
	- merge
	- not-for-all-audiences
	- nsfw
	- mixtral
	license: cc-by-nc-4.0
	---

	<!-- description start -->
	ExLlamaV2 3.75bpw quantization of NoromaidxOpenGPT4-2 from [NeverSleep](https://huggingface.co/NeverSleep/NoromaidxOpenGPT4-2), quantized with the default calibration dataset. The measurement JSON file is included, so you can make your own quants.
	> [!IMPORTANT]
	> This bpw is a good size for 24GB GPUs and can fit 32k context. Make sure to enable the 4-bit cache option, or you'll run into OOM errors.
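
The 24GB claim above can be sanity-checked with rough arithmetic. The sketch below is only an estimate: it assumes Mixtral-8x7B's published shape (about 46.7B total parameters, 32 layers, 8 key/value heads, head dimension 128), none of which is stated in this card. It treats 3.75 bpw as an average over all weights and ignores activation buffers, the 6-bit head, and the scale factors a quantized cache stores.

```python
GIB = 2**30

# Assumed Mixtral-8x7B shape (not taken from this card).
TOTAL_PARAMS = 46.7e9
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128


def weights_gib(bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return TOTAL_PARAMS * bits_per_weight / 8 / GIB


def kv_cache_gib(context_tokens: int, bytes_per_element: float) -> float:
    """Approximate key/value cache size in GiB for a given context length."""
    # One key and one value vector per layer and per KV head for each token.
    elements_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM
    return context_tokens * elements_per_token * bytes_per_element / GIB


if __name__ == "__main__":
    weights = weights_gib(3.75)
    fp16_cache = kv_cache_gib(32768, 2.0)  # 16-bit cache
    q4_cache = kv_cache_gib(32768, 0.5)    # 4-bit cache, scales ignored
    print(f"weights: {weights:.1f} GiB")
    print(f"weights + FP16 cache: {weights + fp16_cache:.1f} GiB")
    print(f"weights + 4-bit cache: {weights + q4_cache:.1f} GiB")
```

Under these assumptions the weights take roughly 20.4 GiB. A 32k FP16 cache adds 4 GiB, which exceeds a 24 GiB budget, while a 4-bit cache adds about 1 GiB and leaves some headroom.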

	> [!NOTE]
	> This model is one of the better Mixtral derivatives for RP (roleplay), and I recommend using it with the Alpaca preset in SillyTavern.

	## Original Card
	## Description

	This repo contains fp16 files of NoromaidxOpenGPT4-2.

	The model was created by merging Noromaid-8x7b-Instruct with Open_Gpt4_8x7B_v0.2 in exactly the same way [Rombodawg](https://huggingface.co/rombodawg) did his merge.

	The only difference between [NoromaidxOpenGPT4-1](https://huggingface.co/NeverSleep/NoromaidxOpenGPT4-1/) and [NoromaidxOpenGPT4-2](https://huggingface.co/NeverSleep/NoromaidxOpenGPT4-2/) is that the first iteration uses Mixtral-8x7B as the base for the merge (f16), whereas the second uses Open_Gpt4_8x7B_v0.2 as the base (bf16).

	After further testing and usage, both models were released, because each has its own qualities.

	You can download the imatrix file to make other quants [here](https://huggingface.co/NeverSleep/NoromaidxOpenGPT4-2/blob/main/imatrix-2.dat).
	<!-- description end -->
	<!-- prompt-template start -->
	### Prompt template:

	## Alpaca

	```
	### Instruction:
	{system prompt}

	### Input:
	{prompt}

	### Response:
	{output}
	```

	## Mistral

	```
	[INST] {prompt} [/INST]
	```
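
For anyone building prompts by hand rather than through a frontend preset, the two templates above can be filled in with plain string formatting. This is a minimal sketch: the function names `alpaca_prompt` and `mistral_prompt` are hypothetical, and ending the Alpaca prompt right after `### Response:` so the model writes the output is an assumption about how the template is meant to be used.

```python
def alpaca_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the Alpaca template, leaving the response section open for the model."""
    return (
        "### Instruction:\n"
        f"{system_prompt}\n\n"
        "### Input:\n"
        f"{prompt}\n\n"
        "### Response:\n"
    )


def mistral_prompt(prompt: str) -> str:
    """Fill the Mistral instruct template."""
    return f"[INST] {prompt} [/INST]"


if __name__ == "__main__":
    print(alpaca_prompt("You are a helpful storyteller.", "Describe a rainy harbor town."))
    print(mistral_prompt("Describe a rainy harbor town."))
```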

	## Merge Details
	### Merge Method

	This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [rombodawg/Open_Gpt4_8x7B_v0.2](https://huggingface.co/rombodawg/Open_Gpt4_8x7B_v0.2) as the base.

	### Models Merged

	The following models were included in the merge:
	* [NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3](https://huggingface.co/NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3)
	* [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	models:
	  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
	    parameters:
	      density: .5
	      weight: 1
	  - model: NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3
	    parameters:
	      density: .5
	      weight: .7
	merge_method: ties
	base_model: rombodawg/Open_Gpt4_8x7B_v0.2
	parameters:
	  normalize: true
	  int8_mask: true
	dtype: bfloat16
	```
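
To make the `density`, `weight`, and `normalize` settings more concrete, here is a toy sketch of the TIES idea on flat lists of numbers: compute each model's delta from the base, keep only the largest-magnitude fraction (`density`), elect a sign per parameter from the weighted sum, and combine only the deltas that agree with that sign. It is a simplified illustration written for this card, not mergekit's actual code, and the function names `trim` and `ties_merge` are hypothetical.

```python
def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    keep_count = int(round(density * len(delta)))
    by_magnitude = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(by_magnitude[:keep_count])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, densities, weights, normalize=True):
    """Toy TIES merge over flat parameter lists (illustrative only)."""
    weighted_deltas = []
    for params, density, weight in zip(models, densities, weights):
        delta = [p - b for p, b in zip(params, base)]
        weighted_deltas.append((weight, [weight * d for d in trim(delta, density)]))

    merged = []
    for j, base_value in enumerate(base):
        # Elect a sign from the weighted sum of the trimmed deltas.
        total = sum(deltas[j] for _, deltas in weighted_deltas)
        sign = 1.0 if total >= 0 else -1.0
        # Keep only contributions whose sign matches the elected one.
        agreeing = [(w, deltas[j]) for w, deltas in weighted_deltas if deltas[j] * sign > 0]
        value = sum(v for _, v in agreeing)
        if normalize and agreeing:
            value /= sum(w for w, _ in agreeing)
        merged.append(base_value + value)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [1.0, -0.2, 0.5, 0.1]  # stands in for the weight-1 model
    model_b = [-0.9, 0.8, 0.1, 0.0]  # stands in for the weight-.7 model
    print(ties_merge(base, [model_a, model_b], [0.5, 0.5], [1.0, 0.7]))
```

In the toy input the two models disagree in sign at the first position. The weighted sum there is positive, so only the first model's delta survives and the second model's contribution is dropped rather than averaged in.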

	### Support

	If you want to support us, you can do so [here](https://ko-fi.com/undiai).