Noodlz
/

DolphinLake-7B

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

DolphinLake-7B / README.md

Noodlz's picture

Update README.md

7822906 verified 7 months ago

|

history blame contribute delete

1.97 kB

	---
	license: apache-2.0
	---
	![image/png](https://cdn-uploads.huggingface.co/production/uploads/63cf23cffbd0cc580bc65c73/Kludqn78R4zztPL48g6QM.png)

	My first successful Dare-Ties merge. Because of the tokenizer difference of the model types (also bf16 vs f16), Had to use Slerp as well.

	Seems to perform well! Did a local lm-eval and HellaSWAG gives me around 84.5, which seems decent. will be submitting this for eval on the openLLM leaderboard as well.

	Preset for this should be ChatML, but standard default presets should work ok too.



	---
	base_model:
	- senseable/WestLake-7B-v2
	- cognitivecomputations/dolphin-2.8-mistral-7b-v02
	library_name: transformers
	tags:
	- mergekit
	- merge

	---
	# Noodlz_DolphinLake-DARE_TIE_SLERP-tokenwest

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	## Merge Details
	### Merge Method

	This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [cognitivecomputations/dolphin-2.8-mistral-7b-v02](https://huggingface.co/cognitivecomputations/dolphin-2.8-mistral-7b-v02) as a base.

	### Models Merged

	The following models were included in the merge:
	* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	merge_method: dare_ties

	parameters:
	int8_mask: true
	t:
	- filter: self_attn
	value: [0, 0.5, 0.3, 0.7, 1]
	- filter: mlp
	value: [1, 0.5, 0.7, 0.3, 0]
	- value: 0.5 # fallback for rest of tensors
	embed_slerp: true

	models:
	- model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
	# No parameters necessary for base model
	- model: senseable/WestLake-7B-v2
	parameters:
	density: 0.58
	weight: 0.8

	base_model: cognitivecomputations/dolphin-2.8-mistral-7b-v02

	tokenizer_source: model:senseable/WestLake-7B-v2

	dtype: bfloat16
	```