Update README.md
README.md CHANGED
@@ -16,119 +16,6 @@ model-index:
 
 When alpha is bumped to 256, it shows effects in the prompts we trained on; lower-alpha or out-of-scope prompts are unaffected.
 
-
-# 2Krows-lora29
-
-This model is a fine-tuned version of [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) on the dataset_name dataset.
-It achieves the following results on the evaluation set:
-- Loss: 2.3268
-- Num Input Tokens Seen: 6646824
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 2e-06
-- train_batch_size: 3
-- eval_batch_size: 1
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine_with_min_lr
-- lr_scheduler_warmup_steps: 10
-- num_epochs: 2.0
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
-|:-------------:|:------:|:----:|:---------------:|:-----------------:|
-| 2.3755 | 0.0303 | 20 | 2.3641 | 102984 |
-| 2.5355 | 0.0606 | 40 | 2.3619 | 208824 |
-| 2.4679 | 0.0909 | 60 | 2.3600 | 309744 |
-| 2.3929 | 0.1212 | 80 | 2.3576 | 417648 |
-| 2.6105 | 0.1515 | 100 | 2.3544 | 513600 |
-| 2.2488 | 0.1818 | 120 | 2.3512 | 613584 |
-| 2.3866 | 0.2121 | 140 | 2.3494 | 713640 |
-| 2.2509 | 0.2424 | 160 | 2.3466 | 818592 |
-| 2.3807 | 0.2727 | 180 | 2.3428 | 920952 |
-| 2.4112 | 0.3030 | 200 | 2.3409 | 1026192 |
-| 2.2447 | 0.3333 | 220 | 2.3399 | 1119096 |
-| 2.4046 | 0.3636 | 240 | 2.3395 | 1230048 |
-| 2.4396 | 0.3939 | 260 | 2.3384 | 1326072 |
-| 2.344 | 0.4242 | 280 | 2.3387 | 1432128 |
-| 2.4901 | 0.4545 | 300 | 2.3373 | 1529520 |
-| 2.3335 | 0.4848 | 320 | 2.3369 | 1628424 |
-| 2.1842 | 0.5152 | 340 | 2.3365 | 1725816 |
-| 2.2118 | 0.5455 | 360 | 2.3351 | 1822440 |
-| 2.3233 | 0.5758 | 380 | 2.3347 | 1928688 |
-| 2.6427 | 0.6061 | 400 | 2.3340 | 2031480 |
-| 2.1955 | 0.6364 | 420 | 2.3337 | 2132520 |
-| 2.4364 | 0.6667 | 440 | 2.3328 | 2227752 |
-| 2.3568 | 0.6970 | 460 | 2.3329 | 2332056 |
-| 2.4508 | 0.7273 | 480 | 2.3324 | 2425032 |
-| 2.3093 | 0.7576 | 500 | 2.3320 | 2527320 |
-| 2.5201 | 0.7879 | 520 | 2.3316 | 2624280 |
-| 2.4758 | 0.8182 | 540 | 2.3317 | 2729160 |
-| 2.5013 | 0.8485 | 560 | 2.3310 | 2827104 |
-| 2.3654 | 0.8788 | 580 | 2.3309 | 2926128 |
-| 2.4142 | 0.9091 | 600 | 2.3308 | 3013968 |
-| 2.588 | 0.9394 | 620 | 2.3307 | 3113856 |
-| 2.1493 | 0.9697 | 640 | 2.3306 | 3209904 |
-| 2.4287 | 1.0 | 660 | 2.3300 | 3305760 |
-| 2.4617 | 1.0303 | 680 | 2.3294 | 3401640 |
-| 2.2166 | 1.0606 | 700 | 2.3294 | 3496272 |
-| 2.2984 | 1.0909 | 720 | 2.3290 | 3595032 |
-| 2.1242 | 1.1212 | 740 | 2.3289 | 3700392 |
-| 2.4892 | 1.1515 | 760 | 2.3289 | 3796848 |
-| 2.4251 | 1.1818 | 780 | 2.3288 | 3901464 |
-| 2.3579 | 1.2121 | 800 | 2.3282 | 4004496 |
-| 2.4868 | 1.2424 | 820 | 2.3284 | 4106040 |
-| 2.4706 | 1.2727 | 840 | 2.3280 | 4210464 |
-| 2.4704 | 1.3030 | 860 | 2.3277 | 4302240 |
-| 2.4403 | 1.3333 | 880 | 2.3281 | 4405608 |
-| 2.4964 | 1.3636 | 900 | 2.3278 | 4502664 |
-| 2.4129 | 1.3939 | 920 | 2.3275 | 4603920 |
-| 2.2943 | 1.4242 | 940 | 2.3277 | 4705800 |
-| 2.362 | 1.4545 | 960 | 2.3274 | 4808688 |
-| 2.2386 | 1.4848 | 980 | 2.3278 | 4910088 |
-| 2.0949 | 1.5152 | 1000 | 2.3278 | 5010864 |
-| 2.5504 | 1.5455 | 1020 | 2.3276 | 5114064 |
-| 2.4542 | 1.5758 | 1040 | 2.3277 | 5223360 |
-| 2.3262 | 1.6061 | 1060 | 2.3271 | 5329752 |
-| 2.2278 | 1.6364 | 1080 | 2.3276 | 5435136 |
-| 2.2484 | 1.6667 | 1100 | 2.3269 | 5533488 |
-| 2.4764 | 1.6970 | 1120 | 2.3271 | 5639376 |
-| 2.3489 | 1.7273 | 1140 | 2.3269 | 5732496 |
-| 2.3448 | 1.7576 | 1160 | 2.3267 | 5831304 |
-| 2.4997 | 1.7879 | 1180 | 2.3263 | 5934528 |
-| 2.311 | 1.8182 | 1200 | 2.3268 | 6035544 |
-| 2.3858 | 1.8485 | 1220 | 2.3266 | 6131112 |
-| 2.3602 | 1.8788 | 1240 | 2.3264 | 6235464 |
-| 2.2347 | 1.9091 | 1260 | 2.3268 | 6335304 |
-| 2.577 | 1.9394 | 1280 | 2.3266 | 6435960 |
-| 2.3001 | 1.9697 | 1300 | 2.3271 | 6543816 |
-| 2.2129 | 2.0 | 1320 | 2.3268 | 6646824 |
-
-### Framework versions
-
-- PEFT 0.11.1
-- Transformers 4.41.2
-- Pytorch 2.3.0+cu121
-- Datasets 2.20.0
-- Tokenizers 0.19.1
+When alpha is bumped to 768, it always steers the conversation to be horny and makes up excuses to create lewd scenarios.
+
+This is completely emergent behaviour; we haven't trained for it. All we did was... [read here in the model card](https://huggingface.co/nothingiisreal/llama3-8B-DWP)
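
Below is a minimal sketch of how the alpha override described above could be tried with PEFT. The adapter repo id is a hypothetical placeholder; only the base model and the 256 / 768 values come from the text. In standard LoRA the adapter's contribution is scaled by `lora_alpha / r`, which is why raising alpha amplifies the adapter's effect on generation.

```python
# Minimal sketch: attach the LoRA adapter with an overridden lora_alpha.
# ADAPTER_ID is a hypothetical placeholder, not the actual repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, PeftModel

BASE_ID = "NousResearch/Meta-Llama-3-8B"
ADAPTER_ID = "your-username/2Krows-lora29"  # placeholder adapter repo

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto")

# LoRA scales the adapter output by lora_alpha / r, so overriding lora_alpha
# before the adapter is attached changes how strongly it steers the base model.
config = LoraConfig.from_pretrained(ADAPTER_ID)
config.lora_alpha = 256  # compare 256 vs. 768 to see the behaviours described above
model = PeftModel.from_pretrained(base_model, ADAPTER_ID, config=config)

inputs = tokenizer("Tell me about your day.", return_tensors="pt").to(base_model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```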