---
base_model: Qwen/Qwen3-4B-Instruct
tags:
- text-generation-inference
- transformers
- qwen3
- gguf
- ollama
- tools
- function-calling
- character-roleplay
- tsundere
- conversational-ai
- fine-tuned
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# QwRiko3-4B-Instruct-2507: Tsundere Kitsune AI (GGUF • Ollama • Tools)

<div align="center">
<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>
</div>

## Model Overview

**QwRiko3-4B-Instruct-2507** is a conversational AI model fine-tuned to embody **Riko**, a tsundere kitsune character. This release targets **GGUF** for **Ollama** first, and supports **tool calling** through the `tools` field of Ollama's chat API. A PyTorch build (Transformers) is also supported.

- **Model ID (this repo):** `subsectmusic/qwriko3-4b-instruct-2507`
- **Primary format:** **GGUF** (Ollama-compatible)
- **Alt format:** PyTorch (Transformers)
- **Base Model:** `Qwen/Qwen3-4B-Instruct`
- **Parameters:** ~4B
- **License:** Apache-2.0 (repo)
- **Project:** Project Horizon LLM
- **Developer:** @subsectmusic
- **Training Framework:** Unsloth + TRL (SFT)

## Character Profile: Riko

- **Tsundere cadence:** “It’s not like I like you or anything… b-baka!”
- **Kitsune vibes:** fox-spirit mischief + sly wisdom
- **Emotional core:** tough shell, soft center
- **Style:** snappy, teasing, ultimately caring

---

## Quick Start (Ollama • GGUF)

> These steps assume you have a local GGUF file named `qwriko3-4b-instruct-2507.Q4_K_M.gguf` in the working directory. If your filename differs, update the `FROM` path in the Modelfile accordingly.

1) **Create a Modelfile** (the exact content below is also saved as `Modelfile` in this package):

```Dockerfile
# Modelfile
FROM ./qwriko3-4b-instruct-2507.Q4_K_M.gguf
PARAMETER num_ctx 8192
# (Optional) Set temperature, top_p, etc. with extra PARAMETER lines here,
# with `/set parameter` inside an `ollama run` session, or via the API `options` field.
```

2) **Create the Ollama model**:

```bash
ollama create qwriko3-4b-instruct-2507 -f Modelfile
```

3) **Chat**:

```bash
ollama run qwriko3-4b-instruct-2507 "Riko, give me a playful hello."
```

### Tool Calling with Ollama (cURL)

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "qwriko3-4b-instruct-2507",
  "messages": [
    { "role": "user", "content": "What is the weather today in Toronto?" }
  ],
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The location to get the weather for, e.g. Toronto"
            },
            "format": {
              "type": "string",
              "description": "Temperature units",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location", "format"]
        }
      }
    }
  ]
}'
```
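
The same request can be issued from Python with only the standard library. The sketch below is illustrative, not part of this package: the helper names `build_payload`, `extract_tool_calls`, and `ask` are hypothetical, and it assumes a non-streaming response in which requested calls appear under `message.tool_calls` with `arguments` already parsed into an object.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # same local endpoint as the cURL call

# Same tool schema as in the cURL example.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "e.g. Toronto"},
                "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "format"],
        },
    },
}


def build_payload(question: str) -> dict:
    """Build a non-streaming /api/chat request body (hypothetical helper)."""
    return {
        "model": "qwriko3-4b-instruct-2507",
        "messages": [{"role": "user", "content": question}],
        "tools": [WEATHER_TOOL],
        "stream": False,
    }


def extract_tool_calls(response: dict) -> list:
    """Return (name, arguments) pairs for every tool call the model requested."""
    calls = response.get("message", {}).get("tool_calls") or []
    return [(c["function"]["name"], c["function"]["arguments"]) for c in calls]


def ask(question: str) -> list:
    """Send the request to a running Ollama server and return the tool calls."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_tool_calls(json.load(resp))

# Requires a running Ollama server with the model created:
# print(ask("What is the weather today in Toronto?"))
```

If the model answers in plain text instead of requesting a tool, `extract_tool_calls` returns an empty list and the reply is in `message.content`.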

### Tool Calling with Ollama (Python)

A complete, ready-to-run example is saved as `tools_demo.py` in this package. It defines a couple of functions and lets the model call them. You can run it after installing the Python client:

```bash
pip install -U ollama
python tools_demo.py
```
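
For orientation, a tool-calling round trip with the `ollama` Python client might look roughly like the sketch below. This is not the content of `tools_demo.py`: the stub `get_current_weather` returns canned data, and `run_tool_calls` and `chat_with_tools` are hypothetical helper names chosen for illustration.

```python
def get_current_weather(location: str, format: str = "celsius") -> str:
    """Hypothetical stub: returns canned data instead of calling a weather service."""
    if format == "celsius":
        return f"21 degrees C and clear in {location}"
    return f"70 degrees F and clear in {location}"


# Registry of callable tools, keyed by the name advertised to the model.
TOOLS = {"get_current_weather": get_current_weather}

WEATHER_SCHEMA = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "format"],
        },
    },
}


def run_tool_calls(tool_calls) -> list:
    """Execute each requested tool and wrap its result as a `tool` message."""
    results = []
    for call in tool_calls or []:
        fn = call["function"]
        handler = TOOLS.get(fn["name"])
        output = handler(**fn["arguments"]) if handler else f"unknown tool: {fn['name']}"
        results.append({"role": "tool", "content": str(output)})
    return results


def chat_with_tools(question: str, model: str = "qwriko3-4b-instruct-2507") -> str:
    """Ask, run any requested tools, then let the model answer using the results."""
    import ollama  # imported lazily so the helpers above work without the client

    messages = [{"role": "user", "content": question}]
    first = ollama.chat(model=model, messages=messages, tools=[WEATHER_SCHEMA])
    tool_calls = first["message"].get("tool_calls")
    if not tool_calls:
        return first["message"]["content"]  # model answered directly
    messages.append(first["message"])
    messages.extend(run_tool_calls(tool_calls))
    final = ollama.chat(model=model, messages=messages)
    return final["message"]["content"]

# Requires a running Ollama server:
# print(chat_with_tools("What is the weather today in Toronto?"))
```

Keeping the dispatch step (`run_tool_calls`) separate from the network calls makes it easy to unit-test tool handling without a server.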

---

## Quick Start (Transformers • PyTorch)

```python
# Requirements:
#   pip install "transformers>=4.51.0" "torch>=2.1.0" accelerate
# (Qwen3 support needs transformers>=4.51.0. CUDA recommended; CPU works but is slower.)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "subsectmusic/qwriko3-4b-instruct-2507"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are Riko, a tsundere kitsune AI. Be witty, teasing, but with hidden warmth."},
    {"role": "user", "content": "Hey Riko, how are you today?"}
]

# Prefer the tokenizer's chat template; fall back to a plain-text prompt if none is set.
if getattr(tokenizer, "chat_template", None):
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_dict=False,  # return a plain tensor of input ids
        return_tensors="pt"
    ).to(model.device)
else:
    prompt = (
        "System: You are Riko, a tsundere kitsune AI. Be witty, teasing, but with hidden warmth.\n"
        "User: Hey Riko, how are you today?\n"
        "Assistant:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

gen = model.generate(
    inputs,
    max_new_tokens=256,
    temperature=0.85,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.1,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
out = tokenizer.decode(gen[0][inputs.shape[1]:], skip_special_tokens=True)
print("\nRiko:", out.strip())
```

---

## Use Cases

- Character roleplay & entertainment
- Creative writing in a tsundere voice
- Personality-driven chatbots
- Research on alternating-turn distillation & style transfer

## Training Summary (SFT)

- **Format:** ShareGPT-style conversations converted to Alpaca-style single-turn pairs
- **Teachers:** Kimi K2 (odd turns) + Horizon Beta (even turns)
- **Focus:** Tsundere kitsune persona, witty banter, emotional subtext
- **Curation:** Manual filtering for tone & safety
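
The ShareGPT-to-Alpaca step can be pictured with a small conversion routine. The sketch below is an assumption about what such a conversion looks like, not the script used for this model: it relies on the common ShareGPT field names (`conversations`, `from`, `value`) and the common Alpaca keys (`instruction`, `input`, `output`), and `sharegpt_to_alpaca` is a hypothetical name.

```python
def sharegpt_to_alpaca(record: dict) -> list:
    """Split one multi-turn ShareGPT record into independent single-turn pairs.

    Each human turn is paired with the assistant turn that follows it.
    System turns and unanswered human turns are dropped.
    """
    pairs = []
    pending = None  # last human message still waiting for a reply
    for turn in record.get("conversations", []):
        role = turn.get("from")
        text = turn.get("value", "").strip()
        if role == "human":
            pending = text
        elif role == "gpt" and pending is not None:
            pairs.append({"instruction": pending, "input": "", "output": text})
            pending = None
    return pairs


example = {
    "conversations": [
        {"from": "human", "value": "Hey Riko, how are you today?"},
        {"from": "gpt", "value": "Hmph. Fine. Not that you needed to ask."},
        {"from": "human", "value": "Did you miss me?"},
        {"from": "gpt", "value": "As if! ...Maybe a little."},
    ]
}
pairs = sharegpt_to_alpaca(example)  # two single-turn training pairs
```

Splitting this way discards conversational context between turns, which is the trade-off of training on single-turn pairs.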

Example SFT settings:

```yaml
Training Framework: Unsloth + TRL SFTTrainer
Base Model: Qwen/Qwen3-4B-Instruct
Batch Size: 2 per device
Gradient Accumulation: 4
Learning Rate: 2e-4
Optimizer: AdamW 8-bit
Weight Decay: 0.01
Scheduler: Linear
Max Steps: 100+
Warmup Steps: 5
Sequence Length: up to model context
Precision: fp16
```
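
As a rough guide to what these settings imply, the sketch below writes them with Hugging Face `TrainingArguments` field names and works out the effective batch size. It is an illustration only: the `sft_args` dictionary and both helper functions are hypothetical, `max_steps` uses 100 as the lower bound of "100+", and a single device is assumed.

```python
# The settings above, expressed with TrainingArguments-style field names.
sft_args = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "optim": "adamw_8bit",
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
    "max_steps": 100,  # lower bound of "100+"
    "warmup_steps": 5,
    "fp16": True,
}


def effective_batch_size(args: dict, num_devices: int = 1) -> int:
    """Sequences contributing to each optimizer step."""
    return (
        args["per_device_train_batch_size"]
        * args["gradient_accumulation_steps"]
        * num_devices
    )


def sequences_seen(args: dict, num_devices: int = 1) -> int:
    """Total training sequences consumed over `max_steps` optimizer steps."""
    return effective_batch_size(args, num_devices) * args["max_steps"]


print(effective_batch_size(sft_args))  # 2 * 4 * 1 = 8
print(sequences_seen(sft_args))        # 8 * 100 = 800
```

Since TRL's `SFTConfig` extends `TrainingArguments`, a dictionary like this should be usable as `SFTConfig(**sft_args)`, though the exact configuration used for this model is not shown here.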

## Specs

| Attribute      | Details                       |
|----------------|-------------------------------|
| Architecture   | Qwen3 Transformer             |
| Parameters     | ~4B                           |
| Base           | Qwen/Qwen3-4B-Instruct        |
| Context Length | Base-dependent (Qwen3 config) |
| Formats        | **GGUF (Ollama)**; PyTorch    |
| Framework      | PyTorch + Transformers        |
| Optimization   | Unsloth-accelerated SFT       |
| Style          | Tsundere kitsune (Riko)       |

## Recommended Inference Settings

```python
# Assumes `tokenizer` and `model` are already loaded as in the Transformers Quick Start.
generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.85,
    "top_p": 0.9,
    "top_k": 50,
    "repetition_penalty": 1.1,
    "do_sample": True,
    "pad_token_id": tokenizer.eos_token_id,
    "eos_token_id": tokenizer.eos_token_id
}
# Usage: model.generate(inputs, **generation_config)
```
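
When running through Ollama rather than Transformers, the same sampling values go in the request's `options` field under Ollama's own parameter names. The mapping below is a minimal sketch; `HF_TO_OLLAMA` and `to_ollama_options` are hypothetical names, and the two repetition penalties are treated as equivalent even though the runtimes may implement them slightly differently.

```python
# Transformers generation key -> Ollama option name.
HF_TO_OLLAMA = {
    "max_new_tokens": "num_predict",
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
    "repetition_penalty": "repeat_penalty",
}


def to_ollama_options(generation_config: dict) -> dict:
    """Translate the sampling settings; keys with no Ollama counterpart are dropped."""
    return {
        HF_TO_OLLAMA[key]: value
        for key, value in generation_config.items()
        if key in HF_TO_OLLAMA
    }


recommended = {
    "max_new_tokens": 256,
    "temperature": 0.85,
    "top_p": 0.9,
    "top_k": 50,
    "repetition_penalty": 1.1,
    "do_sample": True,  # no Ollama equivalent, so it is dropped
}
options = to_ollama_options(recommended)
print(options)
```

The resulting dictionary can be sent as `"options"` in an `/api/chat` request body, or passed as `options=options` to `ollama.chat`.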

## Notes

- In-character style can color responses to factual queries
- Compact 4B size benefits from clear prompts for complex tasks
- Quantization can slightly affect nuance

## Ethics

- Entertainment & creative use; not professional advice
- Follow platform/community guidelines

## Citation

```bibtex
@misc{qwriko3-4b-instruct-2507,
  title={QwRiko3-4B-Instruct-2507: Tsundere Kitsune AI},
  author={subsectmusic},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/subsectmusic/qwriko3-4b-instruct-2507}
}
```

## Acknowledgments

- Kimi K2 & Horizon Beta (teachers)
- Project Horizon LLM (methodology)
- Unsloth, Qwen Team, Hugging Face / TRL
- Ollama (GGUF runtime)

---

<div align="center">
<b>Made with ❤️ using Unsloth</b><br>
<i>Training AI personalities, one tsundere at a time!</i>
</div>