---
base_model:
- allura-org/Gemma-3-Glitter-12B
- ToastyPigeon/Gemma-3-Confetti-12B
- google/gemma-3-12b-it
- google/gemma-3-12b-pt
library_name: transformers
tags:
- mergekit
- merge
---
# 🌠G3 Starshine 12B - Alt🌠
<figure>
  <img src="https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B/resolve/main/modelcard_image.jpeg" width="600">
</figure>

*This was Merge A2 in the testing set.*

A creative writing model built by merging fine-tunes of Gemma 3 12B IT (instruct) and Gemma 3 12B PT (base).

This is the **RP Focused** merge. This version better handles separate characters in turn-based chats with less impersonation.  

See the main [Story Focused](https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B) version as well. 

This is a merge of two G3 models, one trained on instruct and one trained on base: 
* [allura-org/Gemma-3-Glitter-12B](https://huggingface.co/allura-org/Gemma-3-Glitter-12B) - Itself a merge of a storywriting train and an RP train (both also by ToastyPigeon), on instruct.
* [ToastyPigeon/Gemma-3-Confetti-12B](https://huggingface.co/ToastyPigeon/Gemma-3-Confetti-12B) - An experimental application of the Glitter data to base instead of instruct; it additionally includes some adventure data in the form of SpringDragon.
  
The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirit prose, effectively 'loosening up' much of the hesitancy that was left in Glitter. 

Vision works (as well as any vision works with this model right now) if you pair a GGUF of this model with an appropriate mmproj file. I intend to fix the missing vision tower and make this properly multimodal in the near future.

*Thank you to [jebcarter](https://huggingface.co/jebcarter) for the idea to make this. I love how it turned out!*

## Instruct Format

Uses the Gemma 2/3 instruct format, but the model has been trained to recognize an optional system role.

*Note: While it won't immediately balk at the system role, results may be better without it.*

```
<start_of_turn>system
{optional system turn with prompt}<end_of_turn>
<start_of_turn>user
{User messages; can also put sysprompt here to use the built-in g3 training}<end_of_turn>
<start_of_turn>model
{model response}<end_of_turn>
```
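
For anyone assembling prompts by hand, the template above can be rendered with a few lines of Python. This is a minimal sketch, not an official chat template: `format_gemma_prompt` is a hypothetical helper name, the message dictionaries are an assumed input shape, and the sketch assumes your tokenizer or backend adds any BOS token itself.

```python
def format_gemma_prompt(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts into the turn format shown above.

    Hypothetical helper for illustration; it is not part of any library.
    """
    parts = []
    for message in messages:
        role = message["role"]  # expected: "system", "user", or "model"
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    if add_generation_prompt:
        # Leave the model turn open so generation continues from this point.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = format_gemma_prompt([
    {"role": "system", "content": "You are a narrator."},  # optional turn
    {"role": "user", "content": "Describe the tavern."},
])
print(prompt)
```

To skip the system role, drop that entry from the list and put the system prompt text at the start of the first user message instead.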

## Merge Configuration

A higher percentage of Glitter gives this model better turn-based instruct following, but it may be more uptight compared to the Story Focused version. 

```yaml
models:
  - model: ToastyPigeon/Gemma-3-Confetti-12B
    parameters:
      weight: 0.3
  - model: allura-org/Gemma-3-Glitter-12B
    parameters:
      weight: 0.7
merge_method: linear
tokenizer_source: allura-org/Gemma-3-Glitter-12B

```
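
Conceptually, a linear merge takes a weighted average of matching parameters from each model. The sketch below illustrates that idea on toy lists of floats using the 0.3 / 0.7 weights from the config. It is an illustration under stated assumptions, not mergekit's implementation: `linear_merge` is a hypothetical function, the toy values are made up, and the `normalize` argument (dividing by the sum of the weights) is an assumption about how such a merge is typically configured.

```python
def linear_merge(tensors, weights, normalize=True):
    """Weighted element-wise average of equally sized flat tensors.

    Hypothetical illustration using plain lists of floats; real merges
    operate on full model weight tensors.
    """
    if len(tensors) != len(weights):
        raise ValueError("need exactly one weight per tensor")
    length = len(tensors[0])
    if any(len(t) != length for t in tensors):
        raise ValueError("all tensors must have the same length")
    # Assumption: when normalize is True, weights are rescaled to sum to 1.
    total = sum(weights) if normalize else 1.0
    return [
        sum(w * t[i] for t, w in zip(tensors, weights)) / total
        for i in range(length)
    ]


# Toy stand-ins for one parameter tensor from each model (made-up values).
confetti = [1.0, 2.0, -4.0]
glitter = [3.0, 0.0, 6.0]

merged = linear_merge([confetti, glitter], weights=[0.3, 0.7])
print(merged)  # approximately [2.4, 0.6, 3.0]
```

With these weights, each merged value sits closer to Glitter's than to Confetti's, which matches the note above about Glitter's influence on turn-based instruct following.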