File size: 3,993 Bytes
00045ec 0df5370 00045ec db698dc 00045ec 69b4274 00045ec 515a3c8 00045ec 515a3c8 7ab6351 00045ec 1be2f05 00045ec d104ba7 00045ec 7ab6351 515a3c8 96995d9 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 |
---
license: apache-2.0
base_model: Qwen/Qwen3-8B-Base
library_name: transformers
tags:
- mergekit
- axolotl
- unsloth
- roleplay
- conversational
datasets:
- PygmalionAI/PIPPA
- Alfitaria/nemotron-ultra-reasoning-synthkink
- PocketDoc/Dans-Prosemaxx-Gutenberg
- FreedomIntelligence/Medical-R1-Distill-Data
- cognitivecomputations/SystemChat-2.0
- allenai/tulu-3-sft-personas-instruction-following
- kalomaze/Opus_Instruct_25k
- simplescaling/s1K-claude-3-7-sonnet
- ai2-adapt-dev/flan_v2_converted
- grimulkan/theory-of-mind
- grimulkan/physical-reasoning
- nvidia/HelpSteer3
- nbeerbower/gutenberg2-dpo
- nbeerbower/gutenberg-moderne-dpo
- nbeerbower/Purpura-DPO
- antiven0m/physical-reasoning-dpo
- allenai/tulu-3-IF-augmented-on-policy-70b
- NobodyExistsOnTheInternet/system-message-DPO
---
# Q3-8B-Kintsugi

<small><i>get it? because kintsugi sounds like kitsune? hahaha-</i></small>
# Overview
***Q3-8B-Kintsugi*** is a roleplaying model finetuned from [Qwen3-8B-Base](https://huggingface.co/Qwen/Qwen3-8B-Base).
During testing, Kintsugi punched well above its weight class in terms of parameters, especially for 1-on-1 roleplaying and general storywriting.
# Quantizations
EXL3:
- [Official EXL3 quant repo](https://huggingface.co/allura-quants/allura-org_Q3-8B-Kintsugi-EXL3)
GGUF:
- [Official static GGUF quants](https://huggingface.co/allura-quants/allura-org_Q3-8B-Kintsugi-GGUF)
MLX:
- [8, 6, and 4bpw MLX-formrt quants by soundTeam](https://huggingface.co/collections/allura-quants/q3-8b-kintsugi-mlx-684fc48444f1214749f538c4)
# Usage
- Format is plain-old ChatML (please note that, unlike regular Qwen 3, you do *not* need to prefill empty think tags for it not to reason -- see below).
- Settings used by testers varied, but we generally stayed around 0.9 temperature and 0.1 min p. Do *not* use repetition penalties (DRY included). They break it.
- Any system prompt can likely be used, but I used the Shingame system prompt (link will be added later i promise)
- The official instruction following version of Qwen3-8B was not used as a base. Instruction-following is trained in post-hoc, and "thinking" traces were not included. __As a result of this, "thinking" will not function.__
# Training Process
1. The [base model](https://huggingface.co/Qwen/Qwen3-8B-Base) first went through a supervised finetune on a corpus of instruction following data, roleplay conversations, and human writing based on the [Ink](https://huggingface.co/collections/allura-org/ink-6772fd1442308781594bbabb)/[Bigger Body](https://huggingface.co/collections/allura-org/bigger-body-67b277af0861cec33b54745d)/[Remnant](https://huggingface.co/collections/allura-org/remnant-6817c2113bbb2aed501513d0) lineage.
2. Finally, a KTO reinforcement learning phase steered the model away from the very purple prose the initial merge had, and improved its logical+spatial reasoning and sense of overall "intelligence".
Both stages here are very similar to [Q3-30B-A3B-Designant](https://huggingface.co/allura-org/Q3-30B-A3B-Designant), which went through a very similar process with the same data.
# Credits
- Fizz - Training, Data Wrangling
- Toaster, Mango, Bot, probably others I forgot ;-; - Testing
- inflatebot - original Designant model card that this one was yoinked from
- Artus - Funding
- Alibaba - Making the original model
- Axolotl, Unsloth, Huggingface - Making the frameworks used to train this model (Axolotl was used for the SFT process, and Unsloth+TRL was used for the KTO process)
- All quanters, inside and outside the org, specifically Artus, Lyra, and soundTeam/Heni
We would like to thank the Allura community on Discord, especially Curse, Heni, Artus and Mawnipulator, for their companionship and moral support. You all mean the world to us <3 |