Triangle104 committed on
Commit 389a405 · verified · 1 Parent(s): 7a57eef

Update README.md

  1. README.md +18 -0
README.md CHANGED
@@ -17,6 +17,24 @@ tags:
  This model was converted to GGUF format from [`Gryphe/Pantheon-Proto-RP-1.8-30B-A3B`](https://huggingface.co/Gryphe/Pantheon-Proto-RP-1.8-30B-A3B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/Gryphe/Pantheon-Proto-RP-1.8-30B-A3B) for more details on the model.

+ ---
+ Ever since Qwen 3 was released I've been trying to get MoE finetuning to
+ work. After countless frustrating days and much code hacking, I
+ finally got a full finetune to complete with reasonable loss values.
+
+
+ I picked the base model for this since I didn't feel like trying to
+ fight a reasoning model's training. Maybe someday I'll make a model
+ which uses thinking tags for the character's thoughts or something.
+
+
+ This time the recipe focused on combining as many data sources as I
+ possibly could, featuring synthetic data from Sonnet 3.5 + 3.7, ChatGPT
+ 4o and DeepSeek. These then went through an extensive rewriting pipeline
+ to eliminate common AI cliches, in the hope of giving
+ you a fresh experience.
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)