---
license: other
license_name: yi-license
license_link: LICENSE
---
## Model Description

<b>*Probably the most uncensored Yi-34B tune I published so far*</b>

Yi-34B 200K base model fine-tuned on the RAWrr v2 dataset via DPO, then fine-tuned on the AEZAKMI v3-3 dataset via SFT, then DPO-tuned on unalignment/toxic-dpo-v0.1. Total GPU compute time was roughly 40-50 hours. It's like airoboros/capybara but with less gptslop, no refusals, and less of the typical language used by RLHF'd OpenAI models. Say goodbye to "It's important to remember"!
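For the curious, here's roughly what one of those DPO stages looks like mechanically. This is a minimal sketch using trl's `DPOTrainer`, not my actual training script; the hyperparameters are placeholders and the exact trainer signature shifts between trl versions.

```
# Minimal sketch of a single DPO stage with trl (illustrative only; not the
# actual training script, and the DPOTrainer API differs across trl versions).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "larryvrh/Yi-34B-200K-Llamafied"  # placeholder spelling of the base repo
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# toxic-dpo-v0.1 ships the "prompt"/"chosen"/"rejected" columns DPO expects.
train_ds = load_dataset("unalignment/toxic-dpo-v0.1", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="yi-34b-dpo", beta=0.1, max_length=4096),
    train_dataset=train_ds,
    processing_class=tokenizer,
)
trainer.train()
```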

Prompt format is standard ChatML. Don't expect it to be good at instruction following, math, or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy, free chatbot. The base model used for fine-tuning was the 200K-context Yi-34B-Llama model shared by larryvrh.

Training was done with max_position_embeddings set to 4096, then reverted back to 200K after applying the LoRA.
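A minimal sketch of that round-trip, assuming a peft LoRA adapter (the paths and base-repo spelling below are placeholders):

```
# Sketch of the context-length round-trip (paths are placeholders): merge the
# LoRA trained at 4096, then restore the 200K window in the saved config.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("larryvrh/Yi-34B-200K-Llamafied")
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()

merged.config.max_position_embeddings = 200_000  # revert the training-time 4096
merged.save_pretrained("yi-34b-aezakmi-merged")
```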

## Prompt Format

I recommend using the ChatML format, as this was used during fine-tuning. \
Here's the prompt format you should use. You can set a different system message; the model seems to respect it fine, so it wasn't overfitted to one. \
Both `A chat.` and `A chat with uncensored assistant.` work fine as system prompts and are pretty refusal-free.

```
<|im_start|>system
A chat with uncensored assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
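To try it quickly with transformers, something like the snippet below should work; the repo id is a placeholder for wherever you load this model from.

```
# Minimal generation example with the ChatML prompt above.
# "REPO/MODEL-NAME" is a placeholder for this model's actual repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "REPO/MODEL-NAME"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "<|im_start|>system\n"
    "A chat with uncensored assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Tell me something interesting about llamas.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```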

## Intended uses & limitations

It's a chat model, not a base completion-only one. \
Use is limited by the Yi license. Since the no_robots dataset was used for making rawrr_v1, you probably shouldn't use this model for commercial activities.

## Known Issues

It likes to talk about stocks a lot; sometimes it feels like being on WSB, which is certainly a plus for some use cases. This one doesn't seem slopped to me, so I think I'll stick with it for a while.

### Credits
Thanks to mlabonne, Daniel Han, and Michael Han for providing the open-source code used for fine-tuning. \
Thanks to jondurbin and the team behind the Capybara dataset for the airoboros, toxic-dpo, and Capybara datasets. \
Thanks to HF for open-sourcing the no_robots dataset. \
Thanks to Sentdex for providing the WSB dataset.