---
license: apache-2.0
base_model: Qwen/Qwen3-14B
library_name: transformers
tags:
- mergekit
- merge
- qwen3
- uncensored
- reasoning

---
# Qwen3-14B-abliterated-TIES

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [Qwen/Qwen3-14B-Base](https://huggingface.co/Qwen/Qwen3-14B-Base) as the base.
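For reference, TIES operates on "task vectors" (each fine-tuned model's delta from the base): it trims each delta to its largest-magnitude entries, elects a per-parameter sign by majority magnitude, then averages only the entries that agree with the elected sign. A toy single-tensor sketch of that idea (illustrative only, not mergekit's implementation):

```python
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor], density: float = 1.0) -> torch.Tensor:
    """Toy single-tensor TIES: trim, elect sign, disjoint mean."""
    deltas = [ft - base for ft in finetuned]  # "task vectors"

    # 1. trim: keep only the top-`density` fraction of entries by magnitude
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))
        thresh = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= thresh, d, torch.zeros_like(d)))

    stacked = torch.stack(trimmed)

    # 2. elect sign: per-parameter majority sign, weighted by magnitude
    sign = torch.sign(stacked.sum(dim=0))

    # 3. disjoint merge: average only the entries that agree with the elected sign
    agree = (torch.sign(stacked) == sign) & (stacked != 0)
    merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged
```

With a single donor model at `weight: 1` / `density: 1` (as in the config below), trimming and sign election have nothing to resolve, so the operation reduces to transplanting the donor's full delta onto the base.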

### Models Merged

The following models were included in the merge:
* [huihui-ai/Qwen3-14B-abliterated](https://huggingface.co/huihui-ai/Qwen3-14B-abliterated)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: huihui-ai/Qwen3-14B-abliterated
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: Qwen/Qwen3-14B-Base
parameters:
  weight: 1
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
```
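
To reproduce the merge, this file can be passed directly to mergekit's CLI (saved locally, e.g. as `config.yaml`): `mergekit-yaml config.yaml ./Qwen3-14B-abliterated-TIES`.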

## Reasoning Fix

The abliteration and merge introduced an issue where the `<think>` token was not always selected at the start of generation. This was fixed by copying that token's embedding vector from [Qwen/Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B) into the merged model, using the script below.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# paths
src = "Qwen/Qwen3-14B"
tgt = "TARGET_MODEL"
out = "OUTPUT_DIR"

tok_tag = "<think>"

# load
src_tok = AutoTokenizer.from_pretrained(src)
tgt_tok = AutoTokenizer.from_pretrained(tgt)
src_model = AutoModelForCausalLM.from_pretrained(src, torch_dtype="auto", device_map="cpu")
tgt_model = AutoModelForCausalLM.from_pretrained(tgt, torch_dtype="auto", device_map="cpu")

# ids (don't hard-code them, trust the tokenizers)
sid = src_tok.convert_tokens_to_ids(tok_tag)
tid = tgt_tok.convert_tokens_to_ids(tok_tag)

if tid is None or tid == tgt_tok.unk_token_id:
    # tgt lost the token: add it back, resize the embeddings, grab the new id
    tgt_tok.add_tokens([tok_tag], special_tokens=True)
    tid = tgt_tok.convert_tokens_to_ids(tok_tag)
    tgt_model.resize_token_embeddings(len(tgt_tok))

# copy the <think> embedding row
with torch.no_grad():
    tgt_model.get_input_embeddings().weight[tid].copy_(
        src_model.get_input_embeddings().weight[sid]
    )
    # token *selection* happens in the output head; Qwen3-14B leaves input and
    # output embeddings untied, so restore that row too (skipped when tied)
    if tgt_model.get_output_embeddings() is not None and not tgt_model.config.tie_word_embeddings:
        tgt_model.get_output_embeddings().weight[tid].copy_(
            src_model.get_output_embeddings().weight[sid]
        )

# optional: blend instead of overwriting (use in place of the copy above)
# src_vec = src_model.get_input_embeddings().weight[sid]
# tgt_vec = tgt_model.get_input_embeddings().weight[tid].clone()  # snapshot first
# tgt_model.get_input_embeddings().weight[tid].copy_(0.7 * src_vec + 0.3 * tgt_vec)

# save
tgt_model.save_pretrained(out)
tgt_tok.save_pretrained(out)
```
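
As a quick sanity check (a hypothetical sketch, assuming the patched model was saved to `OUTPUT_DIR` as above), generate from a chat prompt and confirm the reply opens with a `<think>` block:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("OUTPUT_DIR")
model = AutoModelForCausalLM.from_pretrained("OUTPUT_DIR", torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 23?"}]
inputs = tok.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=True,  # Qwen3 chat-template flag; the model must emit <think> itself
    return_tensors="pt",
).to(model.device)

out = model.generate(inputs, max_new_tokens=512)
reply = tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=False)
print(reply)  # should open with a <think> ... </think> reasoning block
```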