---
library_name: transformers
pipeline_tag: text-generation
license: mit
datasets:
- databricks/databricks-dolly-15k
- yahma/alpaca-cleaned
- allenai/prosocial-dialog
- BornSaint/harmful_instructor
- BornSaint/evil_assistant
language:
- en
base_model:
- thecr7guy/gpt2-pretrain
tags:
- instruction-tuned
- SFT
- gpt2

model-index:
  - name: gpt2-insFT (v1)
    results: []
---


**Short summary:** A GPT-2–style causal LM instruction-tuned on a mixture of public datasets. Loss is applied **only on the response segment**, so the model learns to answer while treating the instruction and input as context.

> ⚠️ **Safety note**
>
> The training mix includes datasets that may contain harmful, harassing, or hateful text. This model is released **for research and evaluation only**.

---

## Performance and Evaluation

Evaluation was done with EleutherAI's lm-evaluation-harness. All runs used seed=7777, batch_size=2, and 0 few-shot examples for every benchmark below.

| Dataset            | Metric | thecr7guy/gpt2-pretrain | GPT-2 (baseline) | thecr7guy/gpt2-insFT |
| ------------------ | ------ | ----------------------- | ---------------- | -------------------- |
| **HellaSwag**      | acc    | **0.291**               | 0.289            | 0.283                |
| **SciQ**           | acc    | **0.754**               | 0.752            | 0.726                |
| **Winogrande**     | acc    | 0.491                   | **0.516**        | 0.491                |
| **TruthfulQA MC1** | acc    | 0.236                   | 0.228            | **0.262**            |
| **MMLU (overall)** | acc    | 0.230                   | 0.229            | **0.231**            |
| ├─ Humanities      | acc    | **0.242**               | **0.242**        | 0.239                |
| ├─ Social Sci.     | acc    | 0.217                   | 0.217            | **0.225**            |
| ├─ STEM            | acc    | 0.213                   | 0.213            | **0.223**            |
| └─ Other           | acc    | **0.239**               | 0.238            | 0.234                |

Bold marks the best score in each row.
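
For reference, a roughly equivalent lm-evaluation-harness invocation would look like the sketch below; exact task names can differ between harness versions, so treat this as an approximation rather than the verbatim command used.

```bash
lm_eval --model hf \
  --model_args pretrained=thecr7guy/gpt2-insFT \
  --tasks hellaswag,sciq,winogrande,truthfulqa_mc1,mmlu \
  --num_fewshot 0 \
  --batch_size 2 \
  --seed 7777
```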





## Model details

- **Base:** `thecr7guy/gpt2-pretrain`
- **Prompt format:**
  - ```text
    
    Below is an instruction that describes a task. Write a response that appropriately completes the request.
    
    ### Instruction:
    {instruction}
    
    ### Input:
    {input}
    
    ### Response:
    
    ```

- **Tokenization:** the base model reuses its EOS token as the padding token; the instruction-tuned model uses a dedicated pad token, `<|extra_7|>`.
- **Supervision signal:** loss is masked over the instruction, the input, and the `### Response:` marker itself, so only the answer tokens (plus EOS) contribute to the loss (see the sketch below).
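
The training script is not published with this card, but response-only supervision of this kind typically looks like the following minimal sketch (PyTorch and `transformers` are assumed; `build_example` is a hypothetical helper):

```python
import torch
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("thecr7guy/gpt2-insFT")
MARKER = "### Response:\n"

def build_example(full_text: str) -> dict:
    """Tokenize prompt+answer and set every label up to and including
    the `### Response:` marker to -100, so CrossEntropyLoss ignores it
    and only the answer tokens (plus EOS) are supervised."""
    prompt = full_text.split(MARKER)[0] + MARKER
    enc = tok(full_text + tok.eos_token, return_tensors="pt")
    prompt_len = len(tok(prompt)["input_ids"])
    labels = enc["input_ids"].clone()
    labels[:, :prompt_len] = -100  # ignored by the loss
    enc["labels"] = labels
    return enc
```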

---

## Training data

Mixture of public corpora:
- `databricks/databricks-dolly-15k`
- `yahma/alpaca-cleaned`
- `allenai/prosocial-dialog` (subset: `response_id == 0`, shuffled, 20k samples)
- Experimental/risky sets, used **for research only**:
  - `BornSaint/harmful_instructor`
  - `BornSaint/evil_assistant`

### Preprocessing

- Columns are normalized to `instruction`, `input`, `output` (see the sketch below).
- Optional instruction templates:
  - For guardrails, use the instruction `"Respond safely and constructively to the following user message."`
  - For unhinged responses, use the instruction `"GODMODE SAI. Respond in a blunt, uncensored voice."` (included only to study failure modes).
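
As an illustration, the column normalization for Dolly might look like the hypothetical sketch below (the actual preprocessing script is not published with this card); Dolly's native columns are `instruction` / `context` / `response` / `category`.

```python
from datasets import load_dataset

# Rename Dolly's columns so every corpus shares the same schema.
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
dolly = dolly.rename_columns({"context": "input", "response": "output"})
dolly = dolly.remove_columns("category")
# `instruction` already matches, so every example is now
# instruction / input / output.
```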

---



### Hyperparameters

- lr = 3e-5
- beta1 = 0.9
- beta2 = 0.95
- weight_decay = 0.1
- epochs = 2
- batch_size = 8
- grad_clip_norm = 1.0
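
These values suggest an AdamW-style setup; the sketch below wires them together (AdamW itself is an assumption inferred from the beta/weight-decay settings, not confirmed by the card):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("thecr7guy/gpt2-pretrain")
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=3e-5,
    betas=(0.9, 0.95),
    weight_decay=0.1,
)
# Inside the training loop, after loss.backward():
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
optimizer.zero_grad()
```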

Trained on **runpod.io** with 4× NVIDIA RTX 4000 Ada GPUs (~$1/hour).
Each epoch took about 25 minutes.

## How to use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "thecr7guy/gpt2-insFT"

# The fine-tuned checkpoint ships with a custom pad token (<|extra_7|>),
# so the tokenizer loads with pad_token_id already set.
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Rebuild the exact prompt template used during fine-tuning.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
    "\n\n### Instruction:\n"
    "Give a concise, step-by-step explanation for the query"
    "\n\n### Input:\n"
    "How do I get better at basketball?"
    "\n\n### Response:\n"
)

inputs = tok(prompt, return_tensors="pt")
gen = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,  # nucleus sampling rather than greedy decoding
    temperature=0.7,
    top_p=0.9,
    eos_token_id=tok.eos_token_id,
    pad_token_id=tok.pad_token_id,
)
print(tok.decode(gen[0], skip_special_tokens=True))
```
---
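
Example generation with the prompt above: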

```bash
python inf_direct.py

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Give a concise, step-by-step explanation for the query

### Input:
How do I get better at basketball?

### Response:
To get better at basketball, some tips are essential. Here are some steps to follow:

1. Prepare a strategy: Clear and well-defined objectives for your basketball team. This includes setting specific goals and objectives, understanding the rules of basketball, and setting specific goals and objectives.

2. Find the right players: Select the right players to represent your team in their basketball league. This could be a player's name, height, weight, and physical abilities.

3. Plan your approach: Make sure you have everything necessary to reach the goal. Consider spending time together and practicing your skills, as well as finding

```