speechlessai/speechless-mistral-7b-dare-0.85

license: llama2

An experiment with DARE (Drop And REscale). The idea behind DARE is that most delta parameters (the differences between fine-tuned and base weights) can be set directly to zero without affecting the capabilities of SFT LMs, and that larger models tolerate a higher proportion of discarded parameters.

This model is a DARE merge of the models listed in the table below.

weight_mask_rate: 0.85 / use_weight_rescale: True / mask_strategy: random / scaling_coefficient: 1.0
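To make the settings above concrete, here is a minimal sketch of random drop-and-rescale on flat lists of weights. It is not the script used to build this model: the function names `dare_delta` and `merge` are hypothetical, the weights are plain Python floats rather than tensors, and summing the rescaled deltas onto a shared base model is an assumption, since this card does not say how the deltas were combined.

```python
import random


def dare_delta(base, finetuned, weight_mask_rate, use_weight_rescale, rng):
    """Return the drop-and-rescaled delta (finetuned - base) for one model.

    Hypothetical helper for illustration; operates on flat lists of floats.
    """
    keep_prob = 1.0 - weight_mask_rate
    result = []
    for b, f in zip(base, finetuned):
        delta = f - b
        if rng.random() < weight_mask_rate:
            # Random mask: drop this delta parameter entirely.
            delta = 0.0
        elif use_weight_rescale:
            # Rescale survivors so the expected delta is unchanged.
            delta /= keep_prob
        result.append(delta)
    return result


def merge(base, finetuned_models, weight_mask_rate=0.85,
          use_weight_rescale=True, scaling_coefficient=1.0, seed=0):
    """Add the scaled, sparsified deltas of every model onto the base weights.

    Assumption: deltas are simply summed (task-arithmetic style).
    """
    rng = random.Random(seed)  # one generator, so each model gets its own mask
    merged = list(base)
    for finetuned in finetuned_models:
        delta = dare_delta(base, finetuned, weight_mask_rate,
                           use_weight_rescale, rng)
        for i, d in enumerate(delta):
            merged[i] += scaling_coefficient * d
    return merged
```

With `weight_mask_rate=0.85`, roughly 15% of each model's delta parameters survive, and each survivor is divided by 0.15 so that the expected contribution of the delta stays the same.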

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | DROP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intel/neural-chat-7b-v3-1 | 59.06 | 66.21 | 83.64 | 62.37 | 59.65 | 78.14 | 19.56 | 43.84 |
| migtissera/SynthIA-7B-v1.3 | 57.11 | 62.12 | 83.45 | 62.65 | 51.37 | 78.85 | 17.59 | 43.76 |
| bhenrym14/mistral-7b-platypus-fp16 | 56.89 | 63.05 | 84.15 | 64.11 | 45.07 | 78.53 | 17.36 | 45.92 |
| jondurbin/airoboros-m-7b-3.1.2 | 56.24 | 61.86 | 83.51 | 61.91 | 53.75 | 77.58 | 13.87 | 41.2 |
| teknium/CollectiveCognition-v1.1-Mistral-7B | 53.87 | 62.12 | 84.17 | 62.35 | 57.62 | 75.37 | 15.62 | 19.85 |
| uukuguy/speechless-mistral-dolphin-orca-platypus-samantha-7b | 53.34 | 64.33 | 84.4 | 63.72 | 52.52 | 78.37 | 21.38 | 8.66 |