aaronday3 committed · Commit d5644a0 · verified · 1 Parent(s): 36fc964

Update README.md

Files changed (1)
  1. README.md +2 -115
README.md CHANGED
@@ -16,119 +16,6 @@ model-index:
 
  When alpha is bumped to 256, it shows effects in the prompts we trained on; lower-alpha or out-of-scope prompts are unaffected.
 
- The effects we see make it a bit more... human and internet-like rather than professionally written; it also becomes more creative.
-
-
-
- # 2Krows-lora29
-
- This model is a fine-tuned version of [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) on the dataset_name dataset.
- It achieves the following results on the evaluation set:
- - Loss: 2.3268
- - Num Input Tokens Seen: 6646824
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-06
- - train_batch_size: 3
- - eval_batch_size: 1
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine_with_min_lr
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 2.0
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
- |:-------------:|:------:|:----:|:---------------:|:-----------------:|
- | 2.3755 | 0.0303 | 20 | 2.3641 | 102984 |
- | 2.5355 | 0.0606 | 40 | 2.3619 | 208824 |
- | 2.4679 | 0.0909 | 60 | 2.3600 | 309744 |
- | 2.3929 | 0.1212 | 80 | 2.3576 | 417648 |
- | 2.6105 | 0.1515 | 100 | 2.3544 | 513600 |
- | 2.2488 | 0.1818 | 120 | 2.3512 | 613584 |
- | 2.3866 | 0.2121 | 140 | 2.3494 | 713640 |
- | 2.2509 | 0.2424 | 160 | 2.3466 | 818592 |
- | 2.3807 | 0.2727 | 180 | 2.3428 | 920952 |
- | 2.4112 | 0.3030 | 200 | 2.3409 | 1026192 |
- | 2.2447 | 0.3333 | 220 | 2.3399 | 1119096 |
- | 2.4046 | 0.3636 | 240 | 2.3395 | 1230048 |
- | 2.4396 | 0.3939 | 260 | 2.3384 | 1326072 |
- | 2.344 | 0.4242 | 280 | 2.3387 | 1432128 |
- | 2.4901 | 0.4545 | 300 | 2.3373 | 1529520 |
- | 2.3335 | 0.4848 | 320 | 2.3369 | 1628424 |
- | 2.1842 | 0.5152 | 340 | 2.3365 | 1725816 |
- | 2.2118 | 0.5455 | 360 | 2.3351 | 1822440 |
- | 2.3233 | 0.5758 | 380 | 2.3347 | 1928688 |
- | 2.6427 | 0.6061 | 400 | 2.3340 | 2031480 |
- | 2.1955 | 0.6364 | 420 | 2.3337 | 2132520 |
- | 2.4364 | 0.6667 | 440 | 2.3328 | 2227752 |
- | 2.3568 | 0.6970 | 460 | 2.3329 | 2332056 |
- | 2.4508 | 0.7273 | 480 | 2.3324 | 2425032 |
- | 2.3093 | 0.7576 | 500 | 2.3320 | 2527320 |
- | 2.5201 | 0.7879 | 520 | 2.3316 | 2624280 |
- | 2.4758 | 0.8182 | 540 | 2.3317 | 2729160 |
- | 2.5013 | 0.8485 | 560 | 2.3310 | 2827104 |
- | 2.3654 | 0.8788 | 580 | 2.3309 | 2926128 |
- | 2.4142 | 0.9091 | 600 | 2.3308 | 3013968 |
- | 2.588 | 0.9394 | 620 | 2.3307 | 3113856 |
- | 2.1493 | 0.9697 | 640 | 2.3306 | 3209904 |
- | 2.4287 | 1.0 | 660 | 2.3300 | 3305760 |
- | 2.4617 | 1.0303 | 680 | 2.3294 | 3401640 |
- | 2.2166 | 1.0606 | 700 | 2.3294 | 3496272 |
- | 2.2984 | 1.0909 | 720 | 2.3290 | 3595032 |
- | 2.1242 | 1.1212 | 740 | 2.3289 | 3700392 |
- | 2.4892 | 1.1515 | 760 | 2.3289 | 3796848 |
- | 2.4251 | 1.1818 | 780 | 2.3288 | 3901464 |
- | 2.3579 | 1.2121 | 800 | 2.3282 | 4004496 |
- | 2.4868 | 1.2424 | 820 | 2.3284 | 4106040 |
- | 2.4706 | 1.2727 | 840 | 2.3280 | 4210464 |
- | 2.4704 | 1.3030 | 860 | 2.3277 | 4302240 |
- | 2.4403 | 1.3333 | 880 | 2.3281 | 4405608 |
- | 2.4964 | 1.3636 | 900 | 2.3278 | 4502664 |
- | 2.4129 | 1.3939 | 920 | 2.3275 | 4603920 |
- | 2.2943 | 1.4242 | 940 | 2.3277 | 4705800 |
- | 2.362 | 1.4545 | 960 | 2.3274 | 4808688 |
- | 2.2386 | 1.4848 | 980 | 2.3278 | 4910088 |
- | 2.0949 | 1.5152 | 1000 | 2.3278 | 5010864 |
- | 2.5504 | 1.5455 | 1020 | 2.3276 | 5114064 |
- | 2.4542 | 1.5758 | 1040 | 2.3277 | 5223360 |
- | 2.3262 | 1.6061 | 1060 | 2.3271 | 5329752 |
- | 2.2278 | 1.6364 | 1080 | 2.3276 | 5435136 |
- | 2.2484 | 1.6667 | 1100 | 2.3269 | 5533488 |
- | 2.4764 | 1.6970 | 1120 | 2.3271 | 5639376 |
- | 2.3489 | 1.7273 | 1140 | 2.3269 | 5732496 |
- | 2.3448 | 1.7576 | 1160 | 2.3267 | 5831304 |
- | 2.4997 | 1.7879 | 1180 | 2.3263 | 5934528 |
- | 2.311 | 1.8182 | 1200 | 2.3268 | 6035544 |
- | 2.3858 | 1.8485 | 1220 | 2.3266 | 6131112 |
- | 2.3602 | 1.8788 | 1240 | 2.3264 | 6235464 |
- | 2.2347 | 1.9091 | 1260 | 2.3268 | 6335304 |
- | 2.577 | 1.9394 | 1280 | 2.3266 | 6435960 |
- | 2.3001 | 1.9697 | 1300 | 2.3271 | 6543816 |
- | 2.2129 | 2.0 | 1320 | 2.3268 | 6646824 |
-
-
- ### Framework versions
-
- - PEFT 0.11.1
- - Transformers 4.41.2
- - Pytorch 2.3.0+cu121
- - Datasets 2.20.0
- - Tokenizers 0.19.1
 
+ When alpha is bumped to 768, it always steers the conversation toward being horny and makes up excuses to create lewd scenarios.
+
+ This is completely emergent behaviour; we didn't train for it. All we did was... [read here in the model card](https://huggingface.co/nothingiisreal/llama3-8B-DWP)
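
"Alpha" here is the LoRA scaling factor: in standard LoRA, PEFT scales the adapter's delta weights by `lora_alpha / r`, so raising alpha above the value used in training strengthens the fine-tune's effect at inference. Below is a minimal sketch of how one might reproduce the alpha bump, assuming a local copy of the adapter; the directory path and alpha value are illustrative, not from this commit:

```python
# A sketch, not the authors' script: load the adapter with an overridden
# LoRA alpha. In standard LoRA the adapter delta is scaled by
# lora_alpha / r, so a larger alpha amplifies the fine-tune's effect.
import json
from pathlib import Path

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "NousResearch/Meta-Llama-3-8B"
ADAPTER_DIR = Path("llama3-8B-DWP")  # hypothetical local copy of the LoRA adapter

# Override lora_alpha in the adapter config before loading (e.g. 256 or 768).
cfg_path = ADAPTER_DIR / "adapter_config.json"
cfg = json.loads(cfg_path.read_text())
cfg["lora_alpha"] = 256
cfg_path.write_text(json.dumps(cfg, indent=2))

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype="auto")
model = PeftModel.from_pretrained(base, str(ADAPTER_DIR))
```

Editing `adapter_config.json` before loading is the simplest route, since PEFT computes each layer's scaling from that config at load time.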