---
license: apache-2.0
task_categories:
- text-generation
---

Dolphin 🐬
https://erichartford.com/dolphin

This model is Apache-2.0 licensed and can be freely used for any purpose, commercial or non-commercial.

This model is uncensored. I have filtered the dataset to remove alignment and bias, which makes the resulting model highly compliant with any request, even unethical ones. You are advised to implement your own alignment layer before exposing the model as a service. Please read my blog post about uncensored models: https://erichartford.com/uncensored-models
You are responsible for any content you create using this model. Enjoy responsibly.

## Dataset

This dataset is an attempt to replicate the results of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/).

After uncensoring, deduping, and cleaning, our dataset consists of:

- 842,610 instructions of FLANv2 augmented with GPT-4 completions
- 2,625,353 instructions of FLANv2 augmented with GPT-3.5 completions
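
If you want to load the data yourself, a minimal sketch with the `datasets` library is shown below. The file names (`flan1m.jsonl`, `flan5m.jsonl`) and the field names are placeholders assumed for illustration, not confirmed paths in this repository.

```python
# Minimal loading sketch, not an official loader.
# Assumes the GPT-4 and GPT-3.5 splits are JSONL files with
# "instruction", "input", and "output" fields (an assumption).
from datasets import load_dataset

dolphin = load_dataset(
    "json",
    data_files={
        "gpt4": "flan1m.jsonl",   # ~842,610 GPT-4 completions (placeholder name)
        "gpt35": "flan5m.jsonl",  # ~2,625,353 GPT-3.5 completions (placeholder name)
    },
)

print(dolphin)
print(dolphin["gpt4"][0])
```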

We followed the submix and system prompt distribution outlined in the Orca paper, with a few exceptions: we included all 75k CoT examples in the FLAN-1m dataset rather than sampling them, and since many items turned out to be duplicated, we removed duplicates, leaving 3.5m instructions in the ChatGPT dataset.

We then filtered out instances of alignment, refusal, avoidance, and bias in order to produce an uncensored model on top of which you can layer your own personalized alignment LoRA.

We also removed remaining duplicates and cleaned the data.
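
To make that filtering and dedup pass concrete, here is a rough sketch of the kind of script involved. It is illustrative only: the refusal phrase list, field names, and file paths below are placeholders, not the exact ones we used.

```python
# Illustrative sketch of the uncensoring + dedup pass described above.
# Phrase list, field names, and file paths are placeholders.
import json

REFUSAL_MARKERS = [
    "as an ai language model",
    "i'm sorry, but i cannot",
    "it is not appropriate",
    "i cannot fulfill",
]

def looks_aligned(example: dict) -> bool:
    """True if the completion reads like a refusal / alignment response."""
    text = example.get("output", "").lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

seen = set()
kept = []
with open("flan5m-raw.jsonl") as f:               # placeholder input path
    for line in f:
        example = json.loads(line)
        key = (example.get("instruction", ""), example.get("input", ""))
        if key in seen:                            # drop duplicates
            continue
        seen.add(key)
        if looks_aligned(example):                 # drop refusals / alignment
            continue
        kept.append(example)

with open("flan5m-uncensored.jsonl", "w") as f:    # placeholder output path
    for example in kept:
        f.write(json.dumps(example) + "\n")
```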

## Training
We trained with the flan5m (gpt3.5 completions) dataset in its entirety for 3 epochs at a learning rate of 2e-5 before we stopped training to avoid overfitting.
We trained with the flan1m (gpt4 completions) dataset in its entirety for 2.5 epochs at a learning rate of 1e-5 before we stopped training to avoid overfitting.
It took about 600 hours to train on 8x H100s.
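
As a rough illustration only, those two schedules map onto Hugging Face `TrainingArguments` roughly as below. The base model, batch size, precision, and every other setting not stated above are assumptions, not the actual training configuration.

```python
# Sketch: the stated schedules (3 epochs @ 2e-5 on flan5m, then
# 2.5 epochs @ 1e-5 on flan1m) expressed as TrainingArguments.
# Batch size and bf16 are assumptions; everything else unstated is too.
from transformers import TrainingArguments

stage1_args = TrainingArguments(
    output_dir="dolphin-stage1-flan5m",  # GPT-3.5 completions
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=4,       # assumption
    bf16=True,                           # assumption
)

stage2_args = TrainingArguments(
    output_dir="dolphin-stage2-flan1m",  # GPT-4 completions
    num_train_epochs=2.5,
    learning_rate=1e-5,
    per_device_train_batch_size=4,       # assumption
    bf16=True,                           # assumption
)
```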

We used a prompt format similar to Vicuna's, but we added the SYSTEM: field.

Prompt format:
```
SYSTEM: {system}
USER: {prompt}
ASSISTANT:
```

Example:
```
SYSTEM: you are an expert marine biologist.
USER: Please list 10 ways that dolphins are superior to orcas.
ASSISTANT:
```
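
A small helper for assembling that format might look like the following; the function name and its use are illustrative only and not taken from the actual training code.

```python
# Builds a Vicuna-style prompt with the added SYSTEM: field, matching
# the format shown above. Names here are illustrative, not official.
def build_prompt(system: str, prompt: str) -> str:
    return f"SYSTEM: {system}\nUSER: {prompt}\nASSISTANT:"

print(build_prompt(
    "you are an expert marine biologist.",
    "Please list 10 ways that dolphins are superior to orcas.",
))
```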

## Team
The core Dolphin Team includes:
- Eric Hartford
- Pankaj Mathur
- Rob "Rohan" O'Callahan
- Tom "TheBloke" Jobbins

## Gratitude
- Thank you to Microsoft for authoring the Orca paper and inspiring this work.
- Special thanks to WingLian, NanoBit, and Teknium for helpful advice.
- Special thanks to EdenCoder and chirper.ai for mentorship and financial sponsorship.
- Special thanks to Kilkonie for his very valued mentorship.
- Thank you to Catto.
- Thank you to all the other people in the Open Source AI community who have taught me and helped me along the way.