---
license: apache-2.0
base_model: Qwen/Qwen3-14B
library_name: transformers
tags:
- mergekit
- merge
- qwen3
- uncensored
- reasoning
---
# Qwen3-14B-abliterated-TIES
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [Qwen/Qwen3-14B-Base](https://huggingface.co/Qwen/Qwen3-14B-Base) as a base.
### Models Merged
The following models were included in the merge:
* [huihui-ai/Qwen3-14B-abliterated](https://huggingface.co/huihui-ai/Qwen3-14B-abliterated)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: huihui-ai/Qwen3-14B-abliterated
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: Qwen/Qwen3-14B-Base
parameters:
  weight: 1
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
```
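To reproduce the merge (an assumption about the workflow; any recent mergekit release should work), save the block above as `config.yaml` and run `mergekit-yaml config.yaml ./Qwen3-14B-abliterated-TIES`.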
## Reasoning Fix
The abliteration and merge caused an issue where the `<think>` token would not always be properly selected at the start of a response. This was fixed by copying that token's embedding vector from [Qwen/Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B) into the merged model, as in the script below.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# paths
src = "Qwen/Qwen3-14B"
tgt = "TARGET_MODEL"
out = "OUTPUT_DIR"
tok_tag = "<think>"
# load
src_tok = AutoTokenizer.from_pretrained(src)
tgt_tok = AutoTokenizer.from_pretrained(tgt)
src_model = AutoModelForCausalLM.from_pretrained(src, torch_dtype="auto", device_map="cpu")
tgt_model = AutoModelForCausalLM.from_pretrained(tgt, torch_dtype="auto", device_map="cpu")
# ids (don’t hard-code, trust the tokenizer)
sid = src_tok.convert_tokens_to_ids(tok_tag)
tid = tgt_tok.convert_tokens_to_ids(tok_tag)
if tid is None or tid == tgt_tok.unk_token_id:
    # tgt lost the token – add it back, resize, grab new id
    tgt_tok.add_tokens([tok_tag])
    tid = tgt_tok.convert_tokens_to_ids(tok_tag)
    tgt_model.resize_token_embeddings(len(tgt_tok))
# copy the vec
with torch.no_grad():
    tgt_model.get_input_embeddings().weight[tid].copy_(
        src_model.get_input_embeddings().weight[sid]
    )
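# note (an assumption, not part of the original fix): if the model's output head
# is untied from the input embeddings (tie_word_embeddings=False), the matching
# row of tgt_model.get_output_embeddings().weight may need the same copy, since
# token *selection* is driven by the output head.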
# optional blend instead of overwrite
# src_vec = src_model.get_input_embeddings().weight[sid].clone()
# tgt_vec = tgt_model.get_input_embeddings().weight[tid].clone()
# tgt_model.get_input_embeddings().weight[tid].copy_(0.7 * src_vec + 0.3 * tgt_vec)
# save
tgt_model.save_pretrained(out)
tgt_tok.save_pretrained(out)
```
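As a quick sanity check (a minimal sketch, not part of the original fix; `OUTPUT_DIR` is the same placeholder as in the script above), you can load the patched model and confirm that a thinking-enabled prompt now opens with `<think>`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

out = "OUTPUT_DIR"  # placeholder from the fix script
tok = AutoTokenizer.from_pretrained(out)
model = AutoModelForCausalLM.from_pretrained(out, torch_dtype="auto", device_map="auto")

# with enable_thinking=True, Qwen3's chat template leaves it to the model
# to emit <think> itself, which is exactly what the fix is meant to restore
messages = [{"role": "user", "content": "What is 17 * 23?"}]
prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out_ids = model.generate(**inputs, max_new_tokens=64)
reply = tok.decode(out_ids[0][inputs["input_ids"].shape[1]:])
print(reply)  # should start with <think> if the embedding fix took
```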