	---
	base_model: Qwen/Qwen3-4B-Instruct
	tags:
	- text-generation-inference
	- transformers
	- qwen3
	- gguf
	- ollama
	- tools
	- function-calling
	- character-roleplay
	- tsundere
	- conversational-ai
	- fine-tuned
	license: apache-2.0
	language:
	- en
	pipeline_tag: text-generation
	library_name: transformers
	---

	# 🦊 QwRiko3-4B-Instruct-2507 — Tsundere Kitsune AI (GGUF • Ollama • Tools)

	<div align="center">
	<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>
	</div>

	## 📋 Model Overview

	QwRiko3-4B-Instruct-2507 is a conversational AI model fine-tuned to embody Riko, a tsundere kitsune character. This release targets GGUF for Ollama first, with solid tool calling support when run via Ollama’s tools API. A PyTorch build (Transformers) is also supported.

	- Model ID (this repo): `subsectmusic/qwriko3-4b-instruct-2507`
	- Primary format: GGUF (Ollama-compatible)
	- Alt format: PyTorch (Transformers)
	- Base Model: `Qwen/Qwen3-4B-Instruct`
	- Parameters: ~4B
	- License: Apache-2.0 (repo)
	- Project: Project Horizon LLM
	- Developer: @subsectmusic
	- Training Framework: Unsloth + TRL (SFT)

	## 🎭 Character Profile: Riko

	- Tsundere cadence: “It’s not like I like you or anything… b-baka!”
	- Kitsune vibes: fox-spirit mischief + sly wisdom
	- Emotional core: tough shell, soft center
	- Style: snappy, teasing, ultimately caring

	---

	## 🚀 Quick Start (Ollama • GGUF)

	> These steps assume you have a local GGUF file named `qwriko3-4b-instruct-2507.Q4_K_M.gguf` in the working directory. If your filename differs, update the `FROM` path in the Modelfile accordingly.

	1) Create a Modelfile (exact content below is also saved as `Modelfile` in this package):

	```Dockerfile
	# Modelfile
	FROM ./qwriko3-4b-instruct-2507.Q4_K_M.gguf
	PARAMETER num_ctx 8192
	# (Optional) Set temperature/top_p/etc. with extra PARAMETER lines, with `/set parameter` inside `ollama run`, or with the API `options` field.
	```

	2) Create the Ollama model:

	```bash
	ollama create qwriko3-4b-instruct-2507 -f Modelfile
	```

	3) Chat:

	```bash
	ollama run qwriko3-4b-instruct-2507 "Riko, give me a playful hello."
	```

	### Tool Calling with Ollama (cURL)

	```bash
	curl http://localhost:11434/api/chat -d '{
	  "model": "qwriko3-4b-instruct-2507",
	  "messages": [
	    { "role": "user", "content": "What is the weather today in Toronto?" }
	  ],
	  "tools": [
	    {
	      "type": "function",
	      "function": {
	        "name": "get_current_weather",
	        "description": "Get the current weather for a location",
	        "parameters": {
	          "type": "object",
	          "properties": {
	            "location": {
	              "type": "string",
	              "description": "The location to get the weather for, e.g. Toronto"
	            },
	            "format": {
	              "type": "string",
	              "description": "Temperature units",
	              "enum": ["celsius", "fahrenheit"]
	            }
	          },
	          "required": ["location", "format"]
	        }
	      }
	    }
	  ]
	}'
	```
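The same request can be sent from Python with only the standard library. The sketch below is illustrative rather than part of this package: the helper names (`build_payload`, `extract_tool_calls`, `post_chat`) are made up for this example, `"stream": false` is added so the server returns a single JSON object, and `sample_response` only mimics the general shape of a tool-call reply; a real reply depends on what the model generates.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # same endpoint as the cURL call


def build_payload(prompt, tools, model="qwriko3-4b-instruct-2507"):
    """Assemble a non-streaming /api/chat request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
        "stream": False,  # one JSON object instead of a stream of chunks
    }


def extract_tool_calls(response):
    """Return (name, arguments) pairs from a parsed /api/chat response."""
    calls = response.get("message", {}).get("tool_calls") or []
    return [(c["function"]["name"], c["function"]["arguments"]) for c in calls]


def post_chat(payload, url=OLLAMA_CHAT_URL):
    """POST the payload to a locally running Ollama server (not called here)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


# Illustrative response shape only; not captured from a real run.
sample_response = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "function": {
                    "name": "get_current_weather",
                    "arguments": {"location": "Toronto", "format": "celsius"},
                }
            }
        ],
    }
}

print(extract_tool_calls(sample_response))
```

With the server running, `extract_tool_calls(post_chat(build_payload(prompt, tools)))` would give the calls the model asked for; an empty list means it answered in plain text.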

	### Tool Calling with Ollama (Python)

	A complete, ready-to-run example is saved as `tools_demo.py` in this package. It defines a couple of functions and lets the model call them. You can run it after installing the Python client:

	```bash
	pip install -U ollama
	python tools_demo.py
	```
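For orientation, a tool-calling loop with the Python client might look roughly like the sketch below. This is not the contents of `tools_demo.py`: the two tool functions are hypothetical stubs, `dispatch` and `chat_with_tools` are names invented for this example, and it assumes a recent `ollama` client that accepts plain Python functions in `tools` and exposes `response.message.tool_calls`.

```python
def get_current_weather(location: str, unit: str = "celsius") -> str:
    """Hypothetical stub: returns placeholder text, not real weather data."""
    return f"(stub) weather for {location} in {unit}"


def add_numbers(a: float, b: float) -> float:
    """Hypothetical second tool: add two numbers."""
    return a + b


# Map tool names to callables so model-requested calls can be looked up.
AVAILABLE_TOOLS = {f.__name__: f for f in (get_current_weather, add_numbers)}


def dispatch(name, arguments):
    """Run one requested tool and return its result as text."""
    func = AVAILABLE_TOOLS.get(name)
    if func is None:
        return f"unknown tool: {name}"
    return str(func(**arguments))


def chat_with_tools(prompt, model="qwriko3-4b-instruct-2507"):
    """One round trip: ask, run any requested tools, ask again with results."""
    import ollama  # imported lazily so the helpers above work without the client

    messages = [{"role": "user", "content": prompt}]
    response = ollama.chat(
        model=model, messages=messages, tools=list(AVAILABLE_TOOLS.values())
    )
    tool_calls = response.message.tool_calls or []
    if not tool_calls:
        return response.message.content  # the model answered directly

    messages.append(response.message)
    for call in tool_calls:
        result = dispatch(call.function.name, dict(call.function.arguments))
        messages.append(
            {"role": "tool", "content": result, "tool_name": call.function.name}
        )
    return ollama.chat(model=model, messages=messages).message.content


print(dispatch("add_numbers", {"a": 2, "b": 3}))
```

Calling `chat_with_tools("What is 2 plus 3?")` requires a running Ollama server with the model created as in the steps above; the `dispatch` helper can be exercised on its own.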

	---

	## 🧪 Quick Start (Transformers • PyTorch)

	```python
	# Requirements:
	# pip install "transformers>=4.51.0" "torch>=2.1.0" accelerate
	# (Qwen3 support landed in transformers 4.51; CUDA recommended, CPU works but is slower.)

	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	MODEL_ID = "subsectmusic/qwriko3-4b-instruct-2507"

	tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
	model = AutoModelForCausalLM.from_pretrained(
	    MODEL_ID,
	    torch_dtype=torch.float16,
	    device_map="auto",
	)

	messages = [
	    {"role": "system", "content": "You are Riko, a tsundere kitsune AI. Be witty, teasing, but with hidden warmth."},
	    {"role": "user", "content": "Hey Riko, how are you today?"},
	]

	if hasattr(tokenizer, "apply_chat_template"):
	    # Preferred path: let the tokenizer's chat template build the prompt.
	    inputs = tokenizer.apply_chat_template(
	        messages,
	        tokenize=True,
	        add_generation_prompt=True,
	        return_tensors="pt",
	    ).to(model.device)
	else:
	    # Fallback for tokenizers without a chat template: plain-text prompt.
	    prompt = (
	        "System: You are Riko, a tsundere kitsune AI. Be witty, teasing, but with hidden warmth.\n"
	        "User: Hey Riko, how are you today?\n"
	        "Assistant:"
	    )
	    inputs = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

	gen = model.generate(
	    inputs,
	    max_new_tokens=256,
	    temperature=0.85,
	    top_p=0.9,
	    top_k=50,
	    repetition_penalty=1.1,
	    do_sample=True,
	    pad_token_id=tokenizer.eos_token_id,
	    eos_token_id=tokenizer.eos_token_id,
	)
	# Decode only the newly generated tokens, skipping the prompt.
	out = tokenizer.decode(gen[0][inputs.shape[1]:], skip_special_tokens=True)
	print("\nRiko:", out.strip())
	```

	---

	## 💡 Use Cases

	- Character roleplay & entertainment
	- Creative writing in a tsundere voice
	- Personality-driven chatbots
	- Research on alternating-turn distillation & style transfer

	## 🔬 Training Summary (SFT)

	- Format: ShareGPT-style → Alpaca single-turn pairs
	- Teachers: Kimi K2 (odd) + Horizon Beta (even)
	- Focus: Tsundere kitsune persona, witty banter, emotional subtext
	- Curation: Manual filtering for tone & safety
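As an illustration of the format step, the sketch below splits a ShareGPT-style conversation into Alpaca-style single-turn records. The actual conversion script is not part of this card, so treat this as one plausible approach: the keys `conversations`, `from`, and `value` follow the common ShareGPT convention, the function name `sharegpt_to_alpaca` is invented here, and the sample dialogue is made up.

```python
def sharegpt_to_alpaca(conversation):
    """Split one ShareGPT-style conversation into Alpaca-style single-turn records."""
    records = []
    pending_prompt = None
    for turn in conversation["conversations"]:
        if turn["from"] == "human":
            # Remember the user message until the matching reply arrives.
            pending_prompt = turn["value"]
        elif turn["from"] == "gpt" and pending_prompt is not None:
            records.append(
                {"instruction": pending_prompt, "input": "", "output": turn["value"]}
            )
            pending_prompt = None
        # Any other role (e.g. "system") is skipped in this sketch.
    return records


# Made-up sample conversation for demonstration.
sample = {
    "conversations": [
        {"from": "human", "value": "Good morning, Riko."},
        {"from": "gpt", "value": "Hmph. It's not like I was waiting for you."},
        {"from": "human", "value": "Did you sleep well?"},
        {"from": "gpt", "value": "W-why do you care? ...Yes. Thanks for asking."},
    ]
}

for record in sharegpt_to_alpaca(sample):
    print(record)
```

Each human/assistant exchange becomes its own training pair, so multi-turn context is dropped; that trade-off is inherent to single-turn pairs.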

	Example SFT settings:

	```yaml
	Training Framework: Unsloth + TRL SFTTrainer
	Base Model: Qwen/Qwen3-4B-Instruct
	Batch Size: 2 per device
	Gradient Accumulation: 4
	Learning Rate: 2e-4
	Optimizer: AdamW 8-bit
	Weight Decay: 0.01
	Scheduler: Linear
	Max Steps: 100+
	Warmup Steps: 5
	Sequence Length: up to model context
	Precision: fp16
	```
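To make the numbers above concrete, here is a small sketch that writes the settings as keyword arguments and works out the effective batch size. The key names follow Hugging Face `TrainingArguments` (which TRL's `SFTConfig` extends); a single GPU is assumed, and `max_steps` is set to 100 as a lower bound because the card only says "100+".

```python
# Settings from the card, keyed by Hugging Face TrainingArguments names.
sft_settings = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "optim": "adamw_8bit",
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
    "max_steps": 100,  # lower bound; the card lists "100+"
    "warmup_steps": 5,
    "fp16": True,
}


def effective_batch_size(settings, num_devices=1):
    """Sequences contributing to each optimizer step."""
    return (
        settings["per_device_train_batch_size"]
        * settings["gradient_accumulation_steps"]
        * num_devices
    )


def sequences_seen(settings, num_devices=1):
    """Total sequences processed over max_steps optimizer steps."""
    return effective_batch_size(settings, num_devices) * settings["max_steps"]


print("effective batch size:", effective_batch_size(sft_settings))
print("sequences seen in 100 steps:", sequences_seen(sft_settings))
```

On one device this gives 2 × 4 = 8 sequences per optimizer step and 800 sequences over 100 steps, with the 5 warmup steps covering the first 5% of that run.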

	## 📊 Specs

	| Attribute      | Details                       |
	|----------------|-------------------------------|
	| Architecture   | Qwen3 Transformer             |
	| Parameters     | ~4B                           |
	| Base           | Qwen/Qwen3-4B-Instruct        |
	| Context Length | Base-dependent (Qwen3 config) |
	| Formats        | GGUF (Ollama); PyTorch        |
	| Framework      | PyTorch + Transformers        |
	| Optimization   | Unsloth-accelerated SFT       |
	| Style          | Tsundere kitsune (Riko)       |

	## 🎯 Recommended Inference Settings

	```python
	# `tokenizer` is the AutoTokenizer loaded in the Transformers quick start.
	generation_config = {
	    "max_new_tokens": 256,
	    "temperature": 0.85,
	    "top_p": 0.9,
	    "top_k": 50,
	    "repetition_penalty": 1.1,
	    "do_sample": True,
	    "pad_token_id": tokenizer.eos_token_id,
	    "eos_token_id": tokenizer.eos_token_id,
	}
	```
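When running through Ollama instead of Transformers, the same sampling settings can be passed as request `options`. The sketch below shows one way to translate the names; the mapping table and the function `to_ollama_options` are written for this example, and the two runtimes may not implement the repetition penalty identically, so treat the result as a starting point.

```python
# Transformers generation key -> Ollama option name.
HF_TO_OLLAMA = {
    "max_new_tokens": "num_predict",
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
    "repetition_penalty": "repeat_penalty",
}


def to_ollama_options(generation_config):
    """Keep only the settings that have an Ollama counterpart, renamed."""
    return {
        HF_TO_OLLAMA[key]: value
        for key, value in generation_config.items()
        if key in HF_TO_OLLAMA
    }


# Recommended values from this card (tokenizer-specific ids left out).
recommended = {
    "max_new_tokens": 256,
    "temperature": 0.85,
    "top_p": 0.9,
    "top_k": 50,
    "repetition_penalty": 1.1,
    "do_sample": True,  # no Ollama counterpart; dropped by the mapping
}

print(to_ollama_options(recommended))
```

The resulting dictionary can go in the `"options"` field of an `/api/chat` request body or in the `options=` argument of `ollama.chat`.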

	## ⚠️ Notes

	- In-character style can color responses to factual queries
	- Compact 4B size benefits from clear prompts for complex tasks
	- Quantization can slightly affect nuance

	## 🔒 Ethics

	- Entertainment & creative use; not professional advice
	- Follow platform/community guidelines

	## 📚 Citation

	```bibtex
	@misc{qwriko3-4b-instruct-2507,
	  title={QwRiko3-4B-Instruct-2507: Tsundere Kitsune AI},
	  author={subsectmusic},
	  year={2025},
	  publisher={Hugging Face},
	  url={https://huggingface.co/subsectmusic/qwriko3-4b-instruct-2507}
	}
	```

	## 🤝 Acknowledgments

	- Kimi K2 & Horizon Beta (teachers)
	- Project Horizon LLM (methodology)
	- Unsloth, Qwen Team, Hugging Face / TRL
	- Ollama (GGUF runtime)

	---

	<div align="center">
	<b>Made with ❤️ using Unsloth</b><br>
	<i>Training AI personalities, one tsundere at a time!</i>
	</div>