Triangle104 committed (verified) · Commit 2b3b746 · Parent(s): fd596b9

Update README.md

This model was converted to GGUF format from [`EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2`](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.

Refer to the [original model card](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2) for more details on the model.
 
---
## Model details

A RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-14B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity and "flavor" of the resulting model.

Version notes for 0.2: now using the refined dataset from 32B 0.2, with major improvements in coherence, instruction following and long-context comprehension over 14B v0.1.
Prompt format is ChatML.
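For reference, a ChatML-formatted prompt looks like the following (a generic ChatML sketch; the placeholder text in braces is up to the user):

```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
```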
Recommended sampler values:

- Temperature: 0.8
- Min-P: 0.05
- Top-A: 0.3
- Repetition Penalty: 1.03
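With llama.cpp's CLI, the recommended sampler values above map onto flags roughly as follows (a sketch: the GGUF filename is a placeholder for whichever quant you download, and Top-A is not exposed by `llama-cli`, so it is omitted here):

```shell
# Hypothetical filename; substitute the quant you actually downloaded.
llama-cli -m eva-qwen2.5-14b-v0.2-q4_k_m.gguf \
  --temp 0.8 \
  --min-p 0.05 \
  --repeat-penalty 1.03 \
  -cnv
```

`-cnv` starts an interactive conversation using the model's chat template, which matches the ChatML prompt format noted above.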
Recommended SillyTavern presets (via CalamitousFelicitousness):

- Context
- Instruct and System Prompt
Training data:

- Celeste 70B 0.1 data mixture minus the Opus Instruct subset. See that model's card for details.
- Kalomaze's Opus_Instruct_25k dataset, filtered for refusals.
- A subset (1k rows) of ChatGPT-4o-WritingPrompts by Gryphe.
- A subset (2k rows) of Sonnet3.5-Charcards-Roleplay by Gryphe.
- Synthstruct and SynthRP datasets by Epiculous.
- A subset of Dolphin-2.9.3, including a filtered version of not_samantha and a small subset of systemchat.
Training time and hardware:

- 3 hours on 8x H100 SXM, provided by FeatherlessAI.

The model was created by Kearm, Auri and Cahvay.
Special thanks:

- To Cahvay for his work on investigating and reprocessing the corrupted dataset, removing the single biggest source of data poisoning.
- To FeatherlessAI for generously providing an 8x H100 SXM node for training this model.
- To Gryphe, Lemmy, Kalomaze, Nopm, Epiculous and CognitiveComputations for the data.
- To Allura-org for support, feedback, beta-testing and quality control of EVA models.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux):
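The install step above is a single Homebrew command (the `llama.cpp` formula is the one distributed in Homebrew's core tap):

```shell
# Installs the llama-cli and llama-server binaries.
brew install llama.cpp
```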