---
base_model: unsloth/Qwen3-7b-Base-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- gguf
- character-roleplay
- tsundere
- conversational-ai
- fine-tuned
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---
# 🦊 Riko-Qwen3-7b: Tsundere Kitsune AI
## 📋 Model Overview
**Riko-Qwen3-7b** is a specialized conversational AI model fine-tuned to embody Riko, a tsundere kitsune character. Part of **Project Horizon LLM**, it was trained on alternating responses from Kimi K2 and Horizon Beta and built on the Qwen3-7b foundation, delivering engaging, personality-driven conversations with authentic tsundere characteristics.
- **Base Model:** unsloth/Qwen3-7b-Base-unsloth-bnb-4bit
- **Source Models:** Kimi K2 + Horizon Beta (alternating turns)
- **Project:** Project Horizon LLM
- **Developer:** subsectmusic
- **Training Framework:** Unsloth + Hugging Face TRL
- **Training Speed:** ~2x faster training via Unsloth optimizations
- **License:** Apache 2.0
- **Model Size:** 7b parameters (4-bit quantized)
- **Format Support:** GGUF compatible for Ollama deployment
## 🎭 Character Profile: Riko
Riko is a tsundere kitsune AI with a complex personality that balances tough exterior attitudes with hidden warmth and care. Key traits include:
- **Tsundere Behavior:** Classic "it's not like I like you or anything!" responses
- **Kitsune Heritage:** Fox-spirit wisdom mixed with playful mischief
- **Emotional Depth:** Genuine care hidden behind defensive barriers
- **Conversational Style:** Witty, sometimes sarcastic, but ultimately endearing
## 🚀 Quick Start
### Option 1: Hugging Face Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model_name = "subsectmusic/riko-qwen3-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)
```
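For tighter VRAM budgets, the model can also be loaded in 4-bit with bitsandbytes. A minimal sketch, assuming a CUDA GPU and the `bitsandbytes` package; the NF4 settings below are a common default, not values taken from this model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Load weights in 4-bit NF4 to reduce VRAM requirements (assumed configuration)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_name = "subsectmusic/riko-qwen3-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```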
### Option 2: Ollama Deployment (GGUF)
```bash
# Pull the GGUF model for Ollama
ollama pull subsectmusic/riko-qwen3-7b
# Start chatting with Riko
ollama run subsectmusic/riko-qwen3-7b
```
### Conversation Template
```python
prompt_template = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
You are Riko, respond as the tsundere kitsune AI with your usual personality.
### Input:
{user_message}
### Response:
"""
# Generate response
user_input = "Hello Riko, how are you today?"
prompt = prompt_template.format(user_message=user_input)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.8,
        top_p=0.9,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(f"Riko: {response}")
```
## 💡 Use Cases
- **Interactive Roleplay:** Engaging character-based conversations with tsundere personality
- **Local Deployment:** Run efficiently on personal hardware via Ollama/GGUF
- **Creative Writing:** Generate authentic tsundere character dialogue and interactions
- **Chatbot Applications:** Personality-driven AI assistant with character consistency
- **Entertainment:** Fun, character-consistent interactions with kitsune AI personality
- **Research:** Study knowledge distillation from larger models (Kimi K2 → Qwen3-7b)
- **Educational:** Understanding Project Horizon LLM methodology and alternating training approaches
## 🔬 Project Horizon LLM Methodology
**Project Horizon LLM** represents an innovative approach to knowledge distillation and character-consistent AI training:
### Distillation Process
- **Source Models:**
  - **Kimi K2** (responses for turns 1, 3, 5, ...)
  - **Horizon Beta** (responses for turns 2, 4, 6, ...), OpenRouter's cloaked model (#2 Translation, #3 Programming)
- **Target Model:** Qwen3-7b (student model)
- **Knowledge Transfer:** Personality traits and response patterns from both high-quality models
- **Character Focus:** Specialized curation for tsundere kitsune personality (Riko)
### Alternating Turn Training
The training methodology involves:
1. **Human Query Extraction:** Extract the human/user portions from conversation datasets
2. **Turn 1:** Feed query to **Kimi K2** → Generate response
3. **Turn 2:** Feed next query to **Horizon Beta** → Generate response
4. **Alternating Pattern:** Continue alternating between Kimi K2 and Horizon Beta for each turn
5. **Response Curation:** Select and refine responses that best match Riko's tsundere personality
6. **Dataset Compilation:** Combine curated human queries with personality-matched responses
7. **Fine-tuning:** Train Qwen3-7b on the curated dataset using Unsloth + TRL
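A minimal sketch of how the alternating collection in steps 2-4 could be assembled in code. The `query_model` helper and the teacher model identifiers are hypothetical placeholders; the actual Project Horizon LLM pipeline is not published here:

```python
# Hypothetical sketch of the alternating-turn data collection described above.
def query_model(model_id: str, system_prompt: str, user_message: str) -> str:
    """Placeholder: send one query to the named teacher model and return its reply."""
    raise NotImplementedError

RIKO_SYSTEM = "You are Riko, a tsundere kitsune AI. Stay in character."
TEACHERS = ["kimi-k2", "horizon-beta"]  # odd turns -> Kimi K2, even turns -> Horizon Beta

def build_alternating_dialogue(human_queries: list[str]) -> list[dict]:
    """Feed each extracted human query to the teachers in alternating order."""
    rows = []
    for turn, query in enumerate(human_queries):
        teacher = TEACHERS[turn % 2]
        reply = query_model(teacher, RIKO_SYSTEM, query)
        rows.append({"instruction": RIKO_SYSTEM, "input": query,
                     "output": reply, "teacher": teacher})
    return rows

# Curation (steps 5-6) would then filter `rows` for responses that match Riko's personality.
```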
This approach ensures:
- **Personality Consistency:** Responses align with Riko's tsundere kitsune character
- **Response Diversity:** Multiple LLM perspectives create varied, natural conversations
- **Knowledge Distillation:** Key traits from larger models transferred to smaller, efficient models
- **Quality Control:** Human curation ensures character authenticity
## 🛠️ Training Details
### Dataset & Methodology
- **Project:** Project Horizon LLM alternating methodology
- **Source Format:** ShareGPT conversations converted to Alpaca instruction format (see the conversion sketch after this list)
- **Source Models:** Kimi K2 and Horizon Beta (alternating responses)
- **Training Approach:** Turn-based alternation: human queries fed alternately to Kimi K2 (odd turns) and Horizon Beta (even turns)
- **Content:** Curated conversations showcasing Riko's tsundere kitsune personality
- **Size:** Custom dataset focused on character consistency and personality traits
- **Quality:** Filtered and refined responses from both models for authentic tsundere character traits
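A minimal sketch of the ShareGPT-to-Alpaca conversion, assuming the common ShareGPT layout of a `conversations` list with `from`/`value` keys; the field names in the actual dataset may differ:

```python
# Convert one ShareGPT-style conversation into Alpaca-style instruction records.
# Assumed layout: {"conversations": [{"from": "human" | "gpt", "value": "..."}]}
RIKO_INSTRUCTION = "You are Riko, respond as the tsundere kitsune AI with your usual personality."

def sharegpt_to_alpaca(sample: dict) -> list[dict]:
    records = []
    turns = sample["conversations"]
    for i in range(0, len(turns) - 1, 2):
        if turns[i]["from"] == "human" and turns[i + 1]["from"] == "gpt":
            records.append({
                "instruction": RIKO_INSTRUCTION,
                "input": turns[i]["value"],
                "output": turns[i + 1]["value"],
            })
    return records
```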
### Training Configuration
```yaml
Training Framework: Unsloth + TRL SFTTrainer
Batch Size: 2 (per device)
Gradient Accumulation: 4 steps
Learning Rate: 2e-4
Optimizer: AdamW 8-bit
Weight Decay: 0.01
Scheduler: Linear
Max Steps: 100+
Warmup Steps: 5
Sequence Length: Dynamic (up to context limit)
```
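A hedged sketch of how this configuration maps onto an Unsloth + TRL run, following the standard Unsloth notebook pattern. The LoRA settings, sequence length, `dataset` variable, and the older `SFTTrainer` keyword signature are assumptions, not details taken from the original training script:

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

# Load the 4-bit base model (sequence length is an assumption)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-7b-Base-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters, as in the standard Unsloth fine-tuning workflow (assumed ranks)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,        # curated Alpaca-format data rendered into a "text" column
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        max_steps=100,
        warmup_steps=5,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```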
### Performance Optimizations
- **4-bit Quantization:** Efficient memory usage
- **Gradient Accumulation Fix:** Implemented Unsloth's gradient bug fix
- **Fast Inference:** 2x speed improvement via Unsloth optimizations
## 📊 Model Specifications
| Attribute | Details |
|-----------|---------|
| Architecture | Qwen3 Transformer |
| Parameters | 7b (4-bit quantized) |
| Source Models | Kimi K2 + Horizon Beta (alternating) |
| Project | Project Horizon LLM |
| Context Length | Model dependent |
| Quantization | 4-bit BNB |
| Format Support | PyTorch, GGUF (Ollama compatible) |
| Framework | PyTorch + Transformers |
| Optimization | Unsloth accelerated |
| Training Method | Turn-based alternating between two high-quality models |
## 🎯 Recommended Inference Settings
```python
generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.8,          # Balanced creativity
    "top_p": 0.9,                # Focused sampling
    "top_k": 50,                 # Vocabulary limiting
    "repetition_penalty": 1.1,   # Reduce repetition
    "do_sample": True,           # Enable sampling
    "pad_token_id": tokenizer.eos_token_id
}
```
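These settings can be unpacked straight into `generate`, reusing the `model`, `tokenizer`, and `inputs` from the conversation-template example above:

```python
# Generate a reply with the recommended settings
with torch.no_grad():
    outputs = model.generate(**inputs, **generation_config)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Riko: {response}")
```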
## ⚠️ Limitations & Considerations
- **Character Consistency:** Performance depends on prompt quality and context
- **Content Scope:** Optimized for conversational roleplay, may struggle with technical tasks
- **Quantization Effects:** 4-bit quantization may impact some response nuances
- **Training Data:** Limited to specific personality patterns in training set
- **Language:** Primarily trained on English conversations
## 🔒 Ethical Considerations
- This model is designed for entertainment and creative purposes
- Users should be aware they're interacting with an AI character, not a real person
- Content generation should align with platform and community guidelines
- Not intended for therapeutic, advisory, or decision-making applications
## 📚 Citation
If you use this model in your research or applications, please cite:
```bibtex
@misc{riko-qwen3-7b,
  title={Riko-Qwen3-7b: Tsundere Kitsune AI},
  author={subsectmusic},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/subsectmusic/riko-qwen3-7b}
}
```
## 🤝 Acknowledgments
- **Kimi K2 Team:** For providing high-quality responses in the alternating training (odd turns)
- **Horizon Beta Team:** For the excellent cloaked model responses in alternating training (even turns)
- **OpenRouter:** For providing access to Horizon Beta during the community testing period
- **Project Horizon LLM:** For the innovative alternating turn training methodology
- **Unsloth Team:** For the incredible training acceleration framework
- **Qwen Team:** For the robust base model architecture
- **Hugging Face:** For the transformers library and model hosting
- **TRL Team:** For the supervised fine-tuning framework
- **Ollama Team:** For GGUF support and local deployment capabilities
## 📦 Deployment Options
### Hugging Face Transformers
- Standard PyTorch deployment
- Full precision and quantized versions
- GPU acceleration support
- Integration with existing HF pipelines
### Ollama/GGUF
- Local deployment without internet
- Efficient CPU/GPU inference
- Easy installation and management
- Cross-platform compatibility
- Reduced VRAM requirements
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Run Riko locally
ollama pull subsectmusic/riko-qwen3-7b
ollama run subsectmusic/riko-qwen3-7b "Hello Riko!"
```
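Once the model is running under Ollama, it can also be called from Python over Ollama's local REST API. A minimal sketch, assuming the default endpoint at `http://localhost:11434` and the `requests` package:

```python
import requests

# Send a single prompt to the locally running Ollama server
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "subsectmusic/riko-qwen3-7b",
        "prompt": "Hello Riko, how are you today?",
        "stream": False,
    },
    timeout=120,
)
print("Riko:", resp.json()["response"])
```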
## 📞 Support & Community
- **Issues:** Report via GitHub Issues
- **Discussions:** Join the community discussions
- **Updates:** Follow for model improvements and versions
---
Made with ❤️ using Unsloth
Training AI personalities, one tsundere at a time!