---
base_model:
- allura-org/Gemma-3-Glitter-12B
- ToastyPigeon/Gemma-3-Confetti-12B
- google/gemma-3-12b-it
- google/gemma-3-12b-pt
library_name: transformers
tags:
- mergekit
- merge
---

# 🌠G3 Starshine 12B - Alt🌠

<figure>
<img src="https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B/resolve/main/modelcard_image.jpeg" width="600">
</figure>

*This was Merge A2 in the testing set.*

A creative writing model based on a merge of fine-tunes on Gemma 3 12B IT and Gemma 3 12B PT.

This is the **RP Focused** merge. This version handles separate characters in turn-based chats better, with less impersonation.

See the main [Story Focused](https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B) version as well.

This is a merge of two Gemma 3 models, one trained on instruct and one trained on base:

* [allura-org/Gemma-3-Glitter-12B](https://huggingface.co/allura-org/Gemma-3-Glitter-12B) - Itself a merge of a storywriting train and an RP train (both also by ToastyPigeon), on instruct.
* [ToastyPigeon/Gemma-3-Confetti-12B](https://huggingface.co/ToastyPigeon/Gemma-3-Confetti-12B) - Experimental application of the Glitter data using base instead of instruct. It additionally includes some adventure data in the form of SpringDragon.

The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirited prose, effectively 'loosening up' much of the hesitancy that was left in Glitter.

Vision works (as well as any vision works with this model right now) if you pair a GGUF of this with an appropriate mmproj file. I intend to fix the missing vision tower and make this properly multimodal in the near future.

*Thank you to [jebcarter](https://huggingface.co/jebcarter) for the idea to make this. I love how it turned out!*

## Instruct Format

Uses the Gemma 2/3 instruct format, but has been trained to recognize an optional system role.

*Note: While it won't immediately balk at the system role, results may be better without it.*

```
<start_of_turn>system
{optional system turn with prompt}<end_of_turn>
<start_of_turn>user
{User messages; can also put sysprompt here to use the built-in g3 training}<end_of_turn>
<start_of_turn>model
{model response}<end_of_turn>
```
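
If you are assembling prompts by hand instead of relying on a chat template, the format above can be sketched roughly like this. The `build_prompt` helper and its arguments are hypothetical names used for illustration, and the sketch assumes your backend or tokenizer adds the BOS token for you.

```python
def build_prompt(user_message, system_prompt=None, history=()):
    """Assemble a prompt string following the turn template shown above.

    history is an iterable of (user_text, model_text) pairs from earlier turns.
    This is an illustrative helper, not part of any library.
    """
    parts = []
    # The system turn is optional; omit it entirely when no system prompt is given.
    if system_prompt:
        parts.append(f"<start_of_turn>system\n{system_prompt}<end_of_turn>\n")
    # Replay earlier turns in order.
    for user_text, model_text in history:
        parts.append(f"<start_of_turn>user\n{user_text}<end_of_turn>\n")
        parts.append(f"<start_of_turn>model\n{model_text}<end_of_turn>\n")
    # Add the new user message, then open the model turn for generation.
    parts.append(f"<start_of_turn>user\n{user_message}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = build_prompt("Describe the tavern.", system_prompt="You are the narrator.")
print(prompt)
```

Since results may be better without the system role, calling `build_prompt` with no `system_prompt` and putting those instructions at the top of the first user message is worth trying as well.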

### Merge Configuration

A higher percentage of Glitter gives this model better turn-based instruction following, but it may be more uptight compared to the Story Focused version.

```yaml
models:
  - model: ToastyPigeon/Gemma-3-Confetti-12B
    parameters:
      weight: 0.3
  - model: allura-org/Gemma-3-Glitter-12B
    parameters:
      weight: 0.7
merge_method: linear
tokenizer_source: allura-org/Gemma-3-Glitter-12B
```
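
For intuition, a linear merge amounts to a weighted average of each parameter across the source models. The toy sketch below shows that idea using plain Python lists in place of tensors. The `linear_merge` function and the tiny "state dicts" are hypothetical and made up for illustration; this is not mergekit's actual implementation.

```python
def linear_merge(state_dicts, weights):
    """Weighted average of matching parameters across several models.

    state_dicts: list of {parameter_name: list_of_floats}, all with the same keys.
    weights: one float per model; the result is divided by their sum.
    """
    total = sum(weights)
    merged = {}
    for name in state_dicts[0]:
        length = len(state_dicts[0][name])
        merged[name] = [
            sum(w * sd[name][i] for sd, w in zip(state_dicts, weights)) / total
            for i in range(length)
        ]
    return merged


# Toy stand-ins for the two source models (values are invented).
confetti = {"layer.weight": [1.0, 2.0]}
glitter = {"layer.weight": [3.0, 4.0]}

# Same 0.3 / 0.7 split as the configuration above.
merged = linear_merge([confetti, glitter], [0.3, 0.7])
print(merged)  # roughly {'layer.weight': [2.4, 3.4]}
```

With a 0.7 weight, every merged parameter sits closer to Glitter's value than to Confetti's, which is consistent with this version leaning toward Glitter's instruction-following behavior.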