---
base_model: Qwen/Qwen3-4B-Instruct
tags:
- text-generation-inference
- transformers
- qwen3
- gguf
- ollama
- tools
- function-calling
- character-roleplay
- tsundere
- conversational-ai
- fine-tuned
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# 🦊 QwRiko3-4B-Instruct-2507: Tsundere Kitsune AI (GGUF • Ollama • Tools)

## 📋 Model Overview

**QwRiko3-4B-Instruct-2507** is a conversational AI model fine-tuned to embody **Riko**, a tsundere kitsune character. This release targets **GGUF** for **Ollama** first, with **tool calling** support when run through Ollama’s tools API. A PyTorch build (Transformers) is also supported.

- **Model ID (this repo):** `subsectmusic/qwriko3-4b-instruct-2507`
- **Primary format:** **GGUF** (Ollama-compatible)
- **Alt format:** PyTorch (Transformers)
- **Base Model:** `Qwen/Qwen3-4B-Instruct`
- **Parameters:** ~4B
- **License:** Apache-2.0 (repo)
- **Project:** Project Horizon LLM
- **Developer:** @subsectmusic
- **Training Framework:** Unsloth + TRL (SFT)

## 🎭 Character Profile: Riko

- **Tsundere cadence:** “It’s not like I like you or anything… b-baka!”
- **Kitsune vibes:** fox-spirit mischief + sly wisdom
- **Emotional core:** tough shell, soft center
- **Style:** snappy, teasing, ultimately caring

---

## 🚀 Quick Start (Ollama • GGUF)

> These steps assume you have a local GGUF file named `qwriko3-4b-instruct-2507.Q4_K_M.gguf` in the working directory. If your filename differs, update the `FROM` path in the Modelfile accordingly.

1) **Create a Modelfile** (exact content below is also saved as `Modelfile` in this package):

```Dockerfile
# Modelfile
FROM ./qwriko3-4b-instruct-2507.Q4_K_M.gguf
PARAMETER num_ctx 8192
# (Optional) Set temperature/top_p/etc. with additional PARAMETER lines,
# with `/set parameter` inside an interactive `ollama run` session,
# or through the `options` field of the API.
```

2) **Create the Ollama model**:

```bash
ollama create qwriko3-4b-instruct-2507 -f Modelfile
```

3) **Chat**:

```bash
ollama run qwriko3-4b-instruct-2507 "Riko, give me a playful hello."
```

### Tool Calling with Ollama (cURL)

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "qwriko3-4b-instruct-2507",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather today in Toronto?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The location to get the weather for, e.g. Toronto"
            },
            "format": {
              "type": "string",
              "description": "Temperature units",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location", "format"]
        }
      }
    }
  ]
}'
```

### Tool Calling with Ollama (Python)

A complete, ready-to-run example is saved as `tools_demo.py` in this package. It defines a couple of functions and lets the model call them. Install the Python client, then run it:

```bash
pip install -U ollama
python tools_demo.py
```

---

## 🧪 Quick Start (Transformers • PyTorch)

```python
# Requirements:
#   pip install "transformers>=4.51.0" "torch>=2.1.0" accelerate
#   (Qwen3 architectures need transformers 4.51.0 or newer.)
#   (CUDA recommended; CPU works but is slower.)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "subsectmusic/qwriko3-4b-instruct-2507"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are Riko, a tsundere kitsune AI. Be witty, teasing, but with hidden warmth."},
    {"role": "user", "content": "Hey Riko, how are you today?"}
]

# Use the tokenizer's chat template when one is defined; the method itself
# always exists on recent tokenizers, so check the template, not the method.
if getattr(tokenizer, "chat_template", None):
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)
else:
    # Plain-text fallback for tokenizers that ship without a chat template.
    prompt = (
        "System: You are Riko, a tsundere kitsune AI. Be witty, teasing, but with hidden warmth.\n"
        "User: Hey Riko, how are you today?\n"
        "Assistant:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

gen = model.generate(
    inputs,
    max_new_tokens=256,
    temperature=0.85,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.1,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
out = tokenizer.decode(gen[0][inputs.shape[1]:], skip_special_tokens=True)
print("\nRiko:", out.strip())
```

---

## 💡 Use Cases

- Character roleplay & entertainment
- Creative writing in a tsundere voice
- Personality-driven chatbots
- Research on alternating-turn distillation & style transfer

## 🔬 Training Summary (SFT)

- **Format:** ShareGPT-style → Alpaca single-turn pairs
- **Teachers:** Kimi K2 (odd) + Horizon Beta (even)
- **Focus:** Tsundere kitsune persona, witty banter, emotional subtext
- **Curation:** Manual filtering for tone & safety

Example SFT settings:

```yaml
Training Framework: Unsloth + TRL SFTTrainer
Base Model: Qwen/Qwen3-4B-Instruct
Batch Size: 2 per device
Gradient Accumulation: 4
Learning Rate: 2e-4
Optimizer: AdamW 8-bit
Weight Decay: 0.01
Scheduler: Linear
Max Steps: 100+
Warmup Steps: 5
Sequence Length: up to model context
Precision: fp16
```

## 📊 Specs

| Attribute        | Details                       |
|------------------|-------------------------------|
| Architecture     | Qwen3 Transformer             |
| Parameters       | ~4B                           |
| Base             | Qwen/Qwen3-4B-Instruct        |
| Context Length   | Base-dependent (Qwen3 config) |
| Formats          | **GGUF (Ollama)**; PyTorch    |
| Framework        | PyTorch + Transformers        |
| Optimization     | Unsloth-accelerated SFT       |
| Style            | Tsundere kitsune (Riko)       |

## 🎯 Recommended Inference Settings

```python
# Assumes `tokenizer` is already loaded as in the Transformers quick start.
# Pass to generation with: model.generate(inputs, **generation_config)
generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.85,
    "top_p": 0.9,
    "top_k": 50,
    "repetition_penalty": 1.1,
    "do_sample": True,
    "pad_token_id": tokenizer.eos_token_id,
    "eos_token_id": tokenizer.eos_token_id
}
```

## ⚠️ Notes

- In-character style can color responses to factual queries
- Compact 4B size benefits from clear prompts for complex tasks
- Quantization can slightly affect nuance

## 🔒 Ethics

- Entertainment & creative use; not professional advice
- Follow platform/community guidelines

## 📚 Citation

```bibtex
@misc{qwriko3-4b-instruct-2507,
  title={QwRiko3-4B-Instruct-2507: Tsundere Kitsune AI},
  author={subsectmusic},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/subsectmusic/qwriko3-4b-instruct-2507}
}
```

## 🤝 Acknowledgments

- Kimi K2 & Horizon Beta (teachers)
- Project Horizon LLM (methodology)
- Unsloth, Qwen Team, Hugging Face / TRL
- Ollama (GGUF runtime)

---

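
For readers who want to see the shape of the Python tool-calling loop described under "Tool Calling with Ollama (Python)", here is a minimal sketch. It is not the contents of `tools_demo.py`, which is not reproduced in this card. It assumes the same `/api/chat` endpoint and tool schema as the cURL example, uses only the standard library instead of the `ollama` client, and the `get_current_weather` body is a hypothetical stand-in that returns canned data. The helper names `run_tool_calls`, `chat`, and `demo` are illustrative only.

```python
import json
import urllib.request

# Assumed defaults, matching the cURL example above.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "qwriko3-4b-instruct-2507"


def get_current_weather(location: str, format: str = "celsius") -> str:
    """Hypothetical tool: returns canned data, not a real forecast."""
    temp = 22 if format == "celsius" else 72
    return f"{temp} degrees {format} in {location}"


# Registry mapping tool names (as the model sees them) to local functions.
TOOLS = {"get_current_weather": get_current_weather}

# Same schema as the cURL example.
TOOL_SCHEMAS = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The location to get the weather for, e.g. Toronto"},
                "format": {"type": "string", "description": "Temperature units", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "format"],
        },
    },
}]


def run_tool_calls(message: dict) -> list:
    """Run every tool call in an assistant message; return `tool` role messages."""
    results = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        args = fn["arguments"]
        if isinstance(args, str):  # tolerate arguments delivered as JSON text
            args = json.loads(args)
        handler = TOOLS.get(fn["name"])
        content = handler(**args) if handler else f"unknown tool: {fn['name']}"
        results.append({"role": "tool", "content": str(content)})
    return results


def chat(messages: list) -> dict:
    """POST one non-streaming chat request and return the assistant message."""
    payload = {"model": MODEL, "messages": messages, "tools": TOOL_SCHEMAS, "stream": False}
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["message"]


def demo() -> None:
    """One round trip: ask, execute requested tools, then ask for the final reply."""
    messages = [{"role": "user", "content": "What is the weather today in Toronto?"}]
    reply = chat(messages)
    messages += [reply, *run_tool_calls(reply)]
    print(chat(messages)["content"])
```

Calling `demo()` requires a running Ollama server with the model created as in the quick start. The dispatch step in `run_tool_calls` is kept separate from the network call so it can be exercised on its own with a hand-built assistant message.
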
Made with ❤️ using Unsloth
Training AI personalities, one tsundere at a time!