---
license: other
license_name: yi-license
license_link: LICENSE
---
## Model Description

<b>*Probably the most uncensored Yi-34B tune I published so far*</b>

Yi-34B 200K base model fine-tuned on the RAWrr v2 dataset via DPO, then fine-tuned on the AEZAKMI v3-3 dataset via SFT, then DPO-tuned on unalignment/toxic-dpo-v0.1. Total GPU compute time was roughly 40-50 hours. It's like airoboros/capybara but with less gptslop, no refusals, and less of the typical language used by RLHF'd OpenAI models. Say goodbye to "It's important to remember"!
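For the curious, here's roughly what one of those DPO stages looks like mechanically. This is a minimal sketch using trl's `DPOTrainer`, not my actual training script; the hyperparameters are placeholders and the exact trainer signature shifts between trl versions.

```
# Minimal sketch of a single DPO stage with trl (illustrative only; not the
# actual training script, and the DPOTrainer API differs across trl versions).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "larryvrh/Yi-34B-200K-Llamafied"  # placeholder spelling of the base repo
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# toxic-dpo-v0.1 ships the "prompt"/"chosen"/"rejected" columns DPO expects.
train_ds = load_dataset("unalignment/toxic-dpo-v0.1", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="yi-34b-dpo", beta=0.1, max_length=4096),
    train_dataset=train_ds,
    processing_class=tokenizer,
)
trainer.train()
```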

Prompt format is standard ChatML. Don't expect it to be good at instruction following, math, or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy, free chatbot. The base model used for fine-tuning was the 200K-context Yi-34B-Llama model shared by larryvrh.

Training was done with max_position_embeddings set to 4096, then reverted back to 200K after applying the LoRA.
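A minimal sketch of that round-trip, assuming a peft LoRA adapter (the paths and base-repo spelling below are placeholders):

```
# Sketch of the context-length round-trip (paths are placeholders): merge the
# LoRA trained at 4096, then restore the 200K window in the saved config.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("larryvrh/Yi-34B-200K-Llamafied")
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()

merged.config.max_position_embeddings = 200_000  # revert the training-time 4096
merged.save_pretrained("yi-34b-aezakmi-merged")
```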

## Prompt Format

I recommend using the ChatML format, as this was used during fine-tuning. \
Here's the prompt format you should use. You can set a different system message; the model seems to respect it fine, so it wasn't overfitted to one. \
Both `A chat.` and `A chat with uncensored assistant.` work fine as system prompts and are pretty refusal-free.

```
<|im_start|>system
A chat with uncensored assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
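To try it quickly with transformers, something like the snippet below should work; the repo id is a placeholder for wherever you load this model from.

```
# Minimal generation example with the ChatML prompt above.
# "REPO/MODEL-NAME" is a placeholder for this model's actual repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "REPO/MODEL-NAME"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "<|im_start|>system\n"
    "A chat with uncensored assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Tell me something interesting about llamas.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```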

## Intended uses & limitations

It's a chat model, not a base completion-only one. \
Use is limited by the Yi license. Since the no_robots dataset was used for making rawrr_v1, you probably shouldn't use this model for commercial activities.

## Known Issues

It likes to talk about stocks a lot; sometimes it feels like being on WSB, which is certainly a plus for some use cases. This one doesn't seem slopped to me, so I think I'll stick with it for a while.

### Credits
Thanks to mlabonne, Daniel Han, and Michael Han for providing the open-source code used for fine-tuning. \
Thanks to jondurbin and the team behind the Capybara dataset for the airoboros, toxic-dpo, and Capybara datasets. \
Thanks to HF for open-sourcing the no_robots dataset. \
Thanks to Sentdex for providing the WSB dataset.