---
base_model: ToastyPigeon/Gemma-3-Starshine-12B
library_name: transformers
tags:
- mergekit
- merge
- llama-cpp
- gguf-my-repo
---
|
|
|
# Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF |
|
This model was converted to GGUF format from [`ToastyPigeon/Gemma-3-Starshine-12B`](https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
|
Refer to the [original model card](https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B) for more details on the model. |
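
If you prefer to download the quantized file directly instead of letting llama.cpp fetch it on first run, the Hugging Face CLI can do so (a minimal sketch; the `--local-dir` target is arbitrary):

```bash
# Grab the single Q4_K_S GGUF file from this repo
pip install -U "huggingface_hub[cli]"
huggingface-cli download Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF \
  gemma-3-starshine-12b-q4_k_s.gguf --local-dir .
```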
|
|
|
--- |
|
A creative writing model based on a merge of fine-tunes on Gemma 3 12B IT and Gemma 3 12B PT. |
|
|
|
This is the Story Focused merge. This version works better for storytelling and scenarios, as the prose is more novel-like and it has a tendency to impersonate the user character.
|
|
|
See the Alternate RP Focused version as well. |
|
|
|
This is a merge of two G3 models, one trained on instruct and one trained on base: |
|
|
|
- allura-org/Gemma-3-Glitter-12B - itself a merge of a storywriting train and an RP train (both also by ToastyPigeon), built on instruct
|
|
|
- ToastyPigeon/Gemma-3-Confetti-12B - an experimental application of the Glitter data using base instead of instruct; it additionally includes some adventure data in the form of SpringDragon
|
|
|
The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirited prose, effectively 'loosening up' much of the hesitancy that was left in Glitter.
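
For anyone curious how a two-model merge like this is produced in practice: the exact recipe isn't reproduced here, but a mergekit merge of this shape looks roughly like the sketch below. The `slerp` method, `t` value, and dtype are illustrative assumptions, not the actual Starshine configuration:

```bash
# Hypothetical mergekit recipe; the method and weights are guesses for illustration.
cat > starshine-example.yml <<'EOF'
models:
  - model: allura-org/Gemma-3-Glitter-12B
  - model: ToastyPigeon/Gemma-3-Confetti-12B
merge_method: slerp
base_model: allura-org/Gemma-3-Glitter-12B
parameters:
  t: 0.5
dtype: bfloat16
EOF

pip install mergekit
mergekit-yaml starshine-example.yml ./merged-model
```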
|
|
|
--- |
|
## Use with llama.cpp |
|
Install llama.cpp through brew (works on Mac and Linux).
|
|
|
```bash
brew install llama.cpp
```
|
Invoke the llama.cpp server or the CLI. |
|
|
|
### CLI: |
|
```bash
llama-cli --hf-repo Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF --hf-file gemma-3-starshine-12b-q4_k_s.gguf -p "The meaning to life and the universe is"
```
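
Since this is a chat-tuned model, conversation mode is usually a more natural way to try it from the CLI. llama-cli can apply the chat template embedded in the GGUF (a sketch; flag behavior varies somewhat across llama.cpp releases):

```bash
# Start an interactive chat session using the model's built-in chat template
llama-cli --hf-repo Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF --hf-file gemma-3-starshine-12b-q4_k_s.gguf -cnv
```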
|
|
|
### Server: |
|
```bash
llama-server --hf-repo Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF --hf-file gemma-3-starshine-12b-q4_k_s.gguf -c 2048
```
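
Once running, llama-server exposes an OpenAI-compatible HTTP API (port 8080 by default), so you can query it with curl (a minimal sketch; adjust host, port, and sampling parameters as needed):

```bash
# Send a chat completion request to the local llama-server instance
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write the opening paragraph of a space-opera story."}
    ],
    "max_tokens": 200
  }'
```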
|
|
|
Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
|
|
|
Step 1: Clone llama.cpp from GitHub. |
|
```
git clone https://github.com/ggerganov/llama.cpp
```
|
|
|
Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
|
```
cd llama.cpp && LLAMA_CURL=1 make
```
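
For instance, a CUDA-enabled build on Linux could combine the flags like this (a sketch, assuming the Makefile-era `LLAMA_CUDA` flag mentioned above; newer llama.cpp releases build with CMake instead, where the equivalent option is `-DGGML_CUDA=ON`):
```
cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make
```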
|
|
|
Step 3: Run inference through the main binary. |
|
```
./llama-cli --hf-repo Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF --hf-file gemma-3-starshine-12b-q4_k_s.gguf -p "The meaning to life and the universe is"
```
|
or |
|
```
./llama-server --hf-repo Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF --hf-file gemma-3-starshine-12b-q4_k_s.gguf -c 2048
```
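
The `-c 2048` above caps the context window at 2048 tokens; with a GPU-enabled build you can offload layers and raise the context, for example (a sketch, assuming sufficient VRAM):
```
./llama-server --hf-repo Triangle104/Gemma-3-Starshine-12B-Q4_K_S-GGUF --hf-file gemma-3-starshine-12b-q4_k_s.gguf -c 8192 -ngl 99
```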
|
|