---

# Triangle104/EVA-Qwen2.5-7B-v0.1-Q5_K_S-GGUF
This model was converted to GGUF format from [`EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1`](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1) for more details on the model.

---
Model details:

A RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-7B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model.

Version 0.1 notes:
The dataset was deduped and cleaned up from version 0.0, and the learning rate was adjusted. The resulting model seems to be stabler, and the 0.0 problems with handling short inputs and min_p sampling seem to be mostly gone.

The model will be retrained once more, because this run crashed around epoch 1.2 (out of 3) (thanks, DeepSpeed, really appreciate it), and it is still somewhat undertrained as a result.

Prompt format is ChatML.
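For reference, a ChatML conversation is delimited with `<|im_start|>`/`<|im_end|>` tokens like this (the system and user text below is just an illustration, not a recommended prompt):

```
<|im_start|>system
You are a creative writing assistant.<|im_end|>
<|im_start|>user
Write the opening line of a mystery novel.<|im_end|>
<|im_start|>assistant
```

Generation continues from the final `<|im_start|>assistant` header.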

Recommended sampler values:
- Temperature: 0.87
- Top-P: 0.81
- Repetition Penalty: 1.03

The model appears to prefer lower temperatures (at least 0.9 and lower). Min-P seems to work now, as well.
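As a sketch, these values map onto llama.cpp's sampling flags as follows (the flag names are llama.cpp's; `model.gguf` and the prompt are placeholders):

```shell
# Run the model with the recommended sampler settings.
# model.gguf is a placeholder for the downloaded quant file.
llama-cli -m model.gguf \
  --temp 0.87 \
  --top-p 0.81 \
  --repeat-penalty 1.03 \
  -p "Your ChatML-formatted prompt here"
```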
 
Recommended SillyTavern presets (via CalamitousFelicitousness):
- Context
- Instruct and System Prompt

Training data:

- Celeste 70B 0.1 data mixture minus the Opus Instruct subset. See that model's card for details.
- Kalomaze's Opus_Instruct_25k dataset, filtered for refusals.
- A subset (1k rows) of ChatGPT-4o-WritingPrompts by Gryphe.
- A subset (2k rows) of Sonnet3.5-Charcards-Roleplay by Gryphe.
- A cleaned subset (~3k rows) of shortstories_synthlabels by Auri.
- Synthstruct and SynthRP datasets by Epiculous.

Training time and hardware:
2 days on 4x3090Ti (locally)

Model was trained by Kearm and Auri.

Special thanks:
- to Gryphe, Lemmy, Kalomaze, Nopm and Epiculous for the data
- to Alpindale for helping with the FFT config for Qwen2.5
- and to InfermaticAI's community for their continued support for our endeavors

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
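A minimal usage sketch follows. The `--hf-file` name is an assumption derived from the repo name; check the repo's file listing for the exact `.gguf` filename before running:

```shell
# Install llama.cpp (macOS/Linux)
brew install llama.cpp

# Run the model directly from the Hugging Face Hub with the llama.cpp CLI.
# The .gguf filename below is assumed; verify it against the repo's files.
llama-cli --hf-repo Triangle104/EVA-Qwen2.5-7B-v0.1-Q5_K_S-GGUF \
  --hf-file eva-qwen2.5-7b-v0.1-q5_k_s.gguf \
  -p "Write a short scene set in a rainy harbor town."

# Or serve an OpenAI-compatible HTTP endpoint instead:
llama-server --hf-repo Triangle104/EVA-Qwen2.5-7B-v0.1-Q5_K_S-GGUF \
  --hf-file eva-qwen2.5-7b-v0.1-q5_k_s.gguf -c 2048
```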