|
--- |
|
base_model: |
|
- Sao10K/L3-8B-Stheno-v3.2 |
|
- NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
|
|
--- |
|
<img src="https://huggingface.co/Alsebay/L3-8B-SMaid-v0.3/resolve/main/cover/cover.png" alt="img" style="width: 60%; min-width: 120px; height:80%; min-height: 200px; max-width:360px; max-height:600px; display: block"> |
|
|
|
> [!IMPORTANT] |
|
> Thank @mradermacher so much for help me find out that LumiMaid use 'smaug-bpe' pre-tokenizer. So that mean all its quant is unable to use. That mean you can only use Transformer to load this model for now (maybe they will fix or add feature in future) |
|
|
|
# Update: Both version have different presents (settings) to work well |
|
Overall: |
|
|
|
Sao10K Stheno > SMaid V0.3 > SMaid V0.1 in Chai Benchmark |
|
|
|
SMaid V0.1 = Sao10K Stheno > SMaid V0.3 in my custom EQ bench (Sadness and deep thought and Depression test) |
|
|
|
Disclaimed: same seed, same character card, same scenario. 4 times try for each models. |
|
|
|
# Best of L3-8B merge series for me. I choose 2 best variants to publish. |
|
|
|
[SMaid-V0.1](https://huggingface.co/Alsebay/L3-8B-SMaid-v0.1): More smart, understand well content, more novelwriting. I like this version. |
|
|
|
SMaid-V0.3: Upgrade from v0.1. More talkative, active, energetic (wrong setting, lol). It more like Stheno in writting styles, a Stheno version have more data from LumiMaid. |
|
|
|
No V0.2 because I deleted it, it's a worst model of series. |
|
|
|
I think Stheno and Lumumaid can be like a 'ying-yang', so I combine them, lol. Have test on Chaiverse, both of them got > 1995 elo score from begining. (Thanks Sao10K let me know about ChaiVerse :) ) |
|
|
|
SMaid = Stheno (it's very good) + LumiMaid (not too good, but the writing style is good) |
|
|
|
**Recommend present (You can feedback if any setting is better)** |
|
|
|
``` |
|
Temperature - 0.9-1.1 |
|
Min-P - 0.075-0.1 |
|
Top-K - 40-50 |
|
Top_P - 1 |
|
Repetition Penalty - 1.1 |
|
``` |
|
--- |
|
# Below is the auto-generate by Mergekit |
|
|
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS](https://huggingface.co/NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS) as a base. |
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [Sao10K/L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2) |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml |
|
|
|
slices: |
|
- sources: |
|
- layer_range: [0, 16] |
|
model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS |
|
parameters: |
|
density: 0.4 |
|
weight: 1.0 |
|
- layer_range: [0, 16] |
|
model: Sao10K/L3-8B-Stheno-v3.2 |
|
parameters: |
|
density: 0.6 |
|
weight: 0.9 |
|
- sources: |
|
- layer_range: [16, 32] |
|
model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS |
|
parameters: |
|
density: 0.2 |
|
weight: 0.5 |
|
- layer_range: [16, 32] |
|
model: Sao10K/L3-8B-Stheno-v3.2 |
|
parameters: |
|
density: 0.8 |
|
weight: 1.0 |
|
merge_method: dare_ties |
|
base_model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS |
|
parameters: |
|
int8_mask: true |
|
dtype: bfloat16 |
|
|
|
``` |
|
|