ToastyPigeon committed
Commit f652bca · verified · 1 Parent(s): 8b18454

Update README.md

Files changed (1):
1. README.md +34 -12
README.md CHANGED
@@ -2,30 +2,52 @@
 base_model:
 - allura-org/Gemma-3-Glitter-12B
 - ToastyPigeon/Gemma-3-Confetti-12B
+- google/gemma-3-12b-it
+- google/gemma-3-12b-pt
 library_name: transformers
 tags:
 - mergekit
 - merge
-
 ---
-# merged
-
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
-## Merge Details
-### Merge Method
-
-This model was merged using the [Linear](https://arxiv.org/abs/2203.05482) merge method.
-
-### Models Merged
-
-The following models were included in the merge:
-* [allura-org/Gemma-3-Glitter-12B](https://huggingface.co/allura-org/Gemma-3-Glitter-12B)
-* [ToastyPigeon/Gemma-3-Confetti-12B](https://huggingface.co/ToastyPigeon/Gemma-3-Confetti-12B)
-
-### Configuration
-
-The following YAML configuration was used to produce this model:
+# 🌠G3 Starshine 12B - RPFocus🌠
+<figure>
+<img src="https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B-StoryFocus/resolve/main/modelcard_image.jpeg" width="600">
+</figure>
+
+A creative writing model based on a merge of fine-tunes of Gemma 3 12B IT and Gemma 3 12B PT.
+
+This is the **RP Focused** merge. This version better handles separate characters in turn-based chats, with less impersonation.
+
+See the [Story Focused](https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B-StoryFocus/) version as well.
+
+This is a merge of two G3 models, one trained on instruct and one trained on base:
+* [allura-org/Gemma-3-Glitter-12B](https://huggingface.co/allura-org/Gemma-3-Glitter-12B) - itself a merge of a storywriting train and an RP train (both also by ToastyPigeon), built on instruct
+* [ToastyPigeon/Gemma-3-Confetti-12B](https://huggingface.co/ToastyPigeon/Gemma-3-Confetti-12B) - an experimental application of the Glitter data to base instead of instruct; it additionally includes some adventure data in the form of SpringDragon
+
+The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirited prose, effectively 'loosening up' much of the hesitancy that was left in Glitter.
+
+Vision works (as well as any vision works with this model right now) if you pair a GGUF of this with an appropriate mmproj file; I intend to fix the missing vision tower and make this properly multimodal in the near future.
+
+*Thank you to [jebcarter](https://huggingface.co/jebcarter) for the idea to make this. I love how it turned out!*
+
+## Instruct Format
+
+Uses the Gemma 2/3 instruct format, but has been trained to recognize an optional system role.
+
+*Note: while it won't immediately balk at the system role, results may be better without it.*
+
+```
+<start_of_turn>system
+{optional system turn with prompt}<end_of_turn>
+<start_of_turn>user
+{user messages; can also put the sysprompt here to use the built-in G3 training}<end_of_turn>
+<start_of_turn>model
+{model response}<end_of_turn>
+```
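As an aside from the diff: the turn format above can be assembled with plain string formatting. A minimal sketch, assuming raw-string prompting rather than a tokenizer chat template; the `build_prompt` helper is illustrative and not part of the model card:

```python
# Illustrative only: assemble a prompt string in the Gemma 2/3 turn
# format shown above (<start_of_turn>{role} ... <end_of_turn>).

def build_prompt(turns):
    """turns: iterable of (role, text) pairs; role is 'system', 'user', or 'model'."""
    parts = [f"<start_of_turn>{role}\n{text}<end_of_turn>\n" for role, text in turns]
    # Leave an open model turn for the model to complete.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = build_prompt([
    ("system", "You are a concise narrator."),  # optional system turn
    ("user", "Describe the tavern in two sentences."),
])
print(prompt)
```

Per the note above, the system turn is optional; dropping the `("system", ...)` pair (or folding that text into the user turn) matches the built-in G3 training.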
+
+### Merge Configuration
+
+A higher percentage of Glitter gives this model better turn-based instruct following, but it may be more uptight compared to the Story Focused version.
+
 ```yaml
 models: