Update README.md

---
base_model:
- allura-org/Gemma-3-Glitter-12B
- ToastyPigeon/Gemma-3-Confetti-12B
- google/gemma-3-12b-it
- google/gemma-3-12b-pt
library_name: transformers
tags:
- mergekit
- merge
---

# 🌠G3 Starshine 12B - RPFocus🌠

<figure>
<img src="https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B-StoryFocus/resolve/main/modelcard_image.jpeg" width="600">
</figure>

A creative writing model based on a merge of fine-tunes of Gemma 3 12B IT and Gemma 3 12B PT.

This is the **RP Focused** merge. This version handles separate characters in turn-based chats better, with less impersonation.

See the [Story Focused](https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B-StoryFocus/) version as well.

This is a merge of two G3 models, one trained on instruct and one trained on base:

* [allura-org/Gemma-3-Glitter-12B](https://huggingface.co/allura-org/Gemma-3-Glitter-12B) - itself a merge of a storywriting train and an RP train (both also by ToastyPigeon), built on instruct.
* [ToastyPigeon/Gemma-3-Confetti-12B](https://huggingface.co/ToastyPigeon/Gemma-3-Confetti-12B) - an experimental application of the Glitter data to the base model instead of instruct; it additionally includes some adventure data in the form of SpringDragon.

The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirited prose, effectively 'loosening up' much of the hesitancy that was left in Glitter.

Vision works (as well as any vision works with this model right now) if you pair a GGUF of this model with an appropriate mmproj file; I intend to fix the missing vision tower and make this properly multimodal in the near future.

*Thank you to [jebcarter](https://huggingface.co/jebcarter) for the idea to make this. I love how it turned out!*

## Instruct Format

Uses Gemma 2/3 instruct formatting, but has been trained to recognize an optional system role.

*Note: While it won't immediately balk at the system role, results may be better without it.*

```
<start_of_turn>system
{optional system turn with prompt}<end_of_turn>
<start_of_turn>user
{User messages; can also put the system prompt here to use the built-in G3 training}<end_of_turn>
<start_of_turn>model
{model response}<end_of_turn>
```
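
To illustrate the format, here is a minimal text-generation sketch with Transformers that builds the turns by hand rather than relying on a chat template. The repository ID, system/user text, and sampling settings are assumptions for illustration, not recommendations from this card, and the checkpoint is assumed to load as a text-only causal LM.

```python
# Minimal sketch: build the Gemma-style turns manually, including the optional
# system turn described above. Repo ID and settings below are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ToastyPigeon/Gemma-3-Starshine-12B-RPFocus"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "<start_of_turn>system\n"
    "You are narrating a turn-based roleplay. Stay in character as the stranger.<end_of_turn>\n"
    "<start_of_turn>user\n"
    "The tavern door creaks open. What does the stranger do?<end_of_turn>\n"
    "<start_of_turn>model\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.9)
# Print only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

If the system turn gives odd results, fold those instructions into the first user turn instead, as noted above.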

### Merge Configuration

A higher percentage of Glitter gives this model better turn-based instruct following, but it may be more uptight compared to the Story Focused version.
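
If you want to reproduce the merge from the configuration below, a rough sketch follows, assuming mergekit is installed and exposes its documented Python API (`MergeConfiguration`, `run_merge`); the config and output paths are placeholders.

```python
# Rough sketch of reproducing the merge from the YAML configuration below.
# Assumes mergekit is installed (pip install mergekit) and the config is saved
# locally as starshine-rp.yaml (a placeholder name).
# Equivalent CLI entry point: mergekit-yaml starshine-rp.yaml ./merged
import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("starshine-rp.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    "./merged",  # output directory for the merged weights
    options=MergeOptions(cuda=torch.cuda.is_available(), copy_tokenizer=True),
)
```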

```yaml
models: