Triangle104 committed (verified) · Commit 2b3b746 · Parent(s): fd596b9

Update README.md

This model was converted to GGUF format from [`EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2`](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.

Refer to the [original model card](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2) for more details on the model.
 
---
## Model details

A RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-14B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity and "flavor" of the resulting model.

Version notes for 0.2: now using the refined dataset from 32B 0.2, with major improvements in coherence, instruction following and long-context comprehension over 14B v0.1.
Prompt format is ChatML.
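For reference, a ChatML-formatted prompt looks like the following (a generic ChatML sketch; the placeholder text in braces is up to the user):

```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
```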
Recommended sampler values:

- Temperature: 0.8
- Min-P: 0.05
- Top-A: 0.3
- Repetition Penalty: 1.03
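With llama.cpp's CLI, the recommended sampler values above map onto flags roughly as follows (a sketch: the GGUF filename is a placeholder for whichever quant you download, and Top-A is not exposed by `llama-cli`, so it is omitted here):

```shell
# Hypothetical filename; substitute the quant you actually downloaded.
llama-cli -m eva-qwen2.5-14b-v0.2-q4_k_m.gguf \
  --temp 0.8 \
  --min-p 0.05 \
  --repeat-penalty 1.03 \
  -cnv
```

`-cnv` starts an interactive conversation using the model's chat template, which matches the ChatML prompt format noted above.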
Recommended SillyTavern presets (via CalamitousFelicitousness):

- Context
- Instruct and System Prompt
Training data:

- Celeste 70B 0.1 data mixture minus the Opus Instruct subset. See that model's card for details.
- Kalomaze's Opus_Instruct_25k dataset, filtered for refusals.
- A subset (1k rows) of ChatGPT-4o-WritingPrompts by Gryphe.
- A subset (2k rows) of Sonnet3.5-Charcards-Roleplay by Gryphe.
- Synthstruct and SynthRP datasets by Epiculous.
- A subset of Dolphin-2.9.3, including a filtered version of not_samantha and a small subset of systemchat.
Training time and hardware:

- 3 hours on 8x H100 SXM, provided by FeatherlessAI.

The model was created by Kearm, Auri and Cahvay.
Special thanks:

- To Cahvay for his work on investigating and reprocessing the corrupted dataset, removing the single biggest source of data poisoning.
- To FeatherlessAI for generously providing an 8x H100 SXM node for training this model.
- To Gryphe, Lemmy, Kalomaze, Nopm, Epiculous and CognitiveComputations for the data.
- To Allura-org for support, feedback, beta-testing and quality control of EVA models.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux):
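The install step above is a single Homebrew command (the `llama.cpp` formula is the one distributed in Homebrew's core tap):

```shell
# Installs the llama-cli and llama-server binaries.
brew install llama.cpp
```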