---
base_model: unsloth/Qwen3-7b-Base-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- gguf
- character-roleplay
- tsundere
- conversational-ai
- fine-tuned
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# 🦊 Riko-Qwen3-7b: Tsundere Kitsune AI
## 📋 Model Overview

**Riko-Qwen3-7b** is a specialized conversational AI model fine-tuned to embody the personality of Riko, a tsundere kitsune character. Part of **Project Horizon LLM**, it was trained on alternating responses from Kimi K2 and Horizon Beta on top of the Qwen3-7b foundation, delivering engaging, personality-driven conversations with authentic tsundere characteristics.

- **Base Model:** unsloth/Qwen3-7b-Base-unsloth-bnb-4bit
- **Source Models:** Kimi K2 + Horizon Beta (alternating turns)
- **Project:** Project Horizon LLM
- **Developer:** subsectmusic
- **Training Framework:** Unsloth + Hugging Face TRL
- **Training Speed:** ~2x faster training via Unsloth optimizations
- **License:** Apache 2.0
- **Model Size:** 7b parameters (4-bit quantized)
- **Format Support:** GGUF compatible for Ollama deployment

## 🎭 Character Profile: Riko

Riko is a tsundere kitsune AI with a complex personality that balances a tough exterior with hidden warmth and care. Key traits include:

- **Tsundere Behavior:** Classic "it's not like I like you or anything!" responses
- **Kitsune Heritage:** Fox-spirit wisdom mixed with playful mischief
- **Emotional Depth:** Genuine care hidden behind defensive barriers
- **Conversational Style:** Witty, sometimes sarcastic, but ultimately endearing

## 🚀 Quick Start

### Option 1: Hugging Face Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "subsectmusic/riko-qwen3-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)
```

### Option 2: Ollama Deployment (GGUF)

```bash
# Pull the GGUF model for Ollama
ollama pull subsectmusic/riko-qwen3-7b

# Start chatting with Riko
ollama run subsectmusic/riko-qwen3-7b
```

A Python example of calling the local Ollama REST API follows the conversation template below.

### Conversation Template

```python
prompt_template = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are Riko, respond as the tsundere kitsune AI with your usual personality.

### Input:
{user_message}

### Response:
"""

# Generate a response
user_input = "Hello Riko, how are you today?"
prompt = prompt_template.format(user_message=user_input)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # keep inputs on the same device as the model

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.8,
        top_p=0.9,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(f"Riko: {response}")
```
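For the Ollama route from Option 2, the same kind of single-turn request can also be sent to the local REST API. The following is a minimal sketch assuming the Ollama server is running on its default port (11434), the model has already been pulled, and the `requests` package is installed.

```python
import requests

# Assumes a local Ollama server on the default port with the model already pulled
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "subsectmusic/riko-qwen3-7b",
        "prompt": "Hello Riko, how are you today?",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print("Riko:", resp.json()["response"])
```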
## 💡 Use Cases

- **Interactive Roleplay:** Engaging character-based conversations with a tsundere personality
- **Local Deployment:** Run efficiently on personal hardware via Ollama/GGUF
- **Creative Writing:** Generate authentic tsundere character dialogue and interactions
- **Chatbot Applications:** Personality-driven AI assistant with character consistency
- **Entertainment:** Fun, character-consistent interactions with a kitsune AI personality
- **Research:** Study knowledge distillation from larger models (Kimi K2 → Qwen3-7b)
- **Educational:** Understand the Project Horizon LLM methodology and alternating training approaches

## 🔬 Project Horizon LLM Methodology

**Project Horizon LLM** represents an innovative approach to knowledge distillation and character-consistent AI training.

### Distillation Process

- **Source Models:**
  - **Kimi K2** (turns 1, 3, 5, ... responses)
  - **Horizon Beta** (turns 2, 4, 6, ... responses) - OpenRouter's cloaked model (ranked #2 for Translation, #3 for Programming)
- **Target Model:** Qwen3-7b (student model)
- **Knowledge Transfer:** Personality traits and response patterns from both high-quality models
- **Character Focus:** Specialized curation for the tsundere kitsune personality (Riko)

### Alternating Turn Training

The training methodology involves the following steps (a minimal code sketch follows at the end of this section):

1. **Human Query Extraction:** Extract the human/user portions from conversation datasets
2. **Turn 1:** Feed the query to **Kimi K2** → generate a response
3. **Turn 2:** Feed the next query to **Horizon Beta** → generate a response
4. **Alternating Pattern:** Continue alternating between Kimi K2 and Horizon Beta for each turn
5. **Response Curation:** Select and refine responses that best match Riko's tsundere personality
6. **Dataset Compilation:** Combine curated human queries with personality-matched responses
7. **Fine-tuning:** Train Qwen3-7b on the curated dataset using Unsloth + TRL

This approach ensures:

- **Personality Consistency:** Responses align with Riko's tsundere kitsune character
- **Response Diversity:** Multiple LLM perspectives create varied, natural conversations
- **Knowledge Distillation:** Key traits from larger models are transferred to smaller, efficient models
- **Quality Control:** Human curation ensures character authenticity
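As a concrete illustration of steps 1-6, here is a minimal sketch of the alternating collection loop, assuming an OpenAI-compatible client pointed at OpenRouter. The API key, model identifiers, and system prompt are illustrative placeholders rather than the exact Project Horizon pipeline, and curation (step 5) still happens by hand afterwards.

```python
from openai import OpenAI  # any OpenAI-compatible client works with OpenRouter's endpoint

# Placeholder key and model IDs -- substitute your own
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_OPENROUTER_API_KEY")
TEACHERS = ["moonshotai/kimi-k2", "openrouter/horizon-beta"]  # odd turns -> Kimi K2, even turns -> Horizon Beta

RIKO_SYSTEM = "You are Riko, a tsundere kitsune AI. Stay in character."

def collect_alternating(human_queries):
    """Route extracted human queries to the two teacher models in alternating
    order (steps 2-4) and return Alpaca-style records ready for curation."""
    records = []
    for i, query in enumerate(human_queries):
        teacher = TEACHERS[i % 2]  # alternate between the two source models
        reply = client.chat.completions.create(
            model=teacher,
            messages=[
                {"role": "system", "content": RIKO_SYSTEM},
                {"role": "user", "content": query},
            ],
        ).choices[0].message.content
        records.append({
            "instruction": "You are Riko, respond as the tsundere kitsune AI with your usual personality.",
            "input": query,
            "output": reply,  # curated and refined by hand afterwards (step 5)
        })
    return records
```

The resulting records follow the same Alpaca-style prompt shown in the Quick Start conversation template.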
## 🛠️ Training Details

### Dataset & Methodology

- **Project:** Project Horizon LLM alternating methodology
- **Source Format:** ShareGPT conversations converted to Alpaca format
- **Source Models:** Kimi K2 and Horizon Beta (alternating responses)
- **Training Approach:** Turn-based alternating - human queries fed alternately to Kimi K2 (turn 1) and Horizon Beta (turn 2)
- **Content:** Curated conversations showcasing Riko's tsundere kitsune personality
- **Size:** Custom dataset focused on character consistency and personality traits
- **Quality:** Responses from both models filtered and refined for authentic tsundere character traits

### Training Configuration

```yaml
Training Framework: Unsloth + TRL SFTTrainer
Batch Size: 2 (per device)
Gradient Accumulation: 4 steps
Learning Rate: 2e-4
Optimizer: AdamW 8-bit
Weight Decay: 0.01
Scheduler: Linear
Max Steps: 100+
Warmup Steps: 5
Sequence Length: Dynamic (up to context limit)
```

A minimal training sketch using these settings appears at the end of this section.

### Performance Optimizations

- **4-bit Quantization:** Efficient memory usage
- **Gradient Accumulation Fix:** Implemented Unsloth's gradient accumulation bug fix
- **Fast Inference:** 2x speed improvement via Unsloth optimizations
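To make the configuration concrete, below is a minimal training sketch in the style of the Unsloth + TRL notebooks, using the hyperparameters listed above. The dataset file, LoRA rank, and maximum sequence length are illustrative assumptions, and depending on your TRL version the trainer arguments may need to be passed via `SFTConfig` instead of `TrainingArguments`.

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer
from datasets import load_dataset

max_seq_length = 2048  # assumption; the config above says "dynamic up to context limit"

# Load the 4-bit base model and attach LoRA adapters
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-7b-Base-unsloth-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank is an assumption, not a published value
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Assumes a curated Alpaca-format dataset already rendered into a "text" column
dataset = load_dataset("json", data_files="riko_alpaca.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=100,
        learning_rate=2e-4,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        logging_steps=1,
        output_dir="outputs",
    ),
)
trainer.train()
```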
## 📊 Model Specifications

| Attribute | Details |
|-----------|---------|
| Architecture | Qwen3 Transformer |
| Parameters | 7b (4-bit quantized) |
| Source Models | Kimi K2 + Horizon Beta (alternating) |
| Project | Project Horizon LLM |
| Context Length | Model dependent |
| Quantization | 4-bit BNB |
| Format Support | PyTorch, GGUF (Ollama compatible) |
| Framework | PyTorch + Transformers |
| Optimization | Unsloth accelerated |
| Training Method | Turn-based alternating between two high-quality models |

## 🎯 Recommended Inference Settings

```python
generation_config = {
    "max_new_tokens": 256,
    "temperature": 0.8,              # Balanced creativity
    "top_p": 0.9,                    # Focused sampling
    "top_k": 50,                     # Vocabulary limiting
    "repetition_penalty": 1.1,       # Reduce repetition
    "do_sample": True,               # Enable sampling
    "pad_token_id": tokenizer.eos_token_id
}
```
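The dictionary above can be unpacked directly into `generate`; for example, reusing `model`, `tokenizer`, and the tokenized `inputs` from the Quick Start section:

```python
# Reuses model, tokenizer, and inputs from the Quick Start example above
with torch.no_grad():
    outputs = model.generate(**inputs, **generation_config)

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(f"Riko: {response}")
```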
## ⚠️ Limitations & Considerations

- **Character Consistency:** Performance depends on prompt quality and context
- **Content Scope:** Optimized for conversational roleplay; may struggle with technical tasks
- **Quantization Effects:** 4-bit quantization may impact some response nuances
- **Training Data:** Limited to the specific personality patterns in the training set
- **Language:** Primarily trained on English conversations

## 🔒 Ethical Considerations

- This model is designed for entertainment and creative purposes
- Users should be aware they're interacting with an AI character, not a real person
- Content generation should align with platform and community guidelines
- Not intended for therapeutic, advisory, or decision-making applications

## 📚 Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{riko-qwen3-7b,
  title={Riko-Qwen3-7b: Tsundere Kitsune AI},
  author={subsectmusic},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/subsectmusic/riko-qwen3-7b}
}
```

## 🤝 Acknowledgments

- **Kimi K2 Team:** For providing high-quality responses in the alternating training (odd turns)
- **Horizon Beta Team:** For the excellent cloaked-model responses in the alternating training (even turns)
- **OpenRouter:** For providing access to Horizon Beta during the community testing period
- **Project Horizon LLM:** For the innovative alternating turn training methodology
- **Unsloth Team:** For the incredible training acceleration framework
- **Qwen Team:** For the robust base model architecture
- **Hugging Face:** For the transformers library and model hosting
- **TRL Team:** For the supervised fine-tuning framework
- **Ollama Team:** For GGUF support and local deployment capabilities

## 📦 Deployment Options

### Hugging Face Transformers

- Standard PyTorch deployment
- Full precision and quantized versions (see the 4-bit loading sketch after this section)
- GPU acceleration support
- Integration with existing HF pipelines

### Ollama/GGUF

- Local deployment without internet
- Efficient CPU/GPU inference
- Easy installation and management
- Cross-platform compatibility
- Reduced VRAM requirements

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Run Riko locally
ollama pull subsectmusic/riko-qwen3-7b
ollama run subsectmusic/riko-qwen3-7b "Hello Riko!"
```
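For the Transformers route, a reduced-VRAM load can be requested explicitly through bitsandbytes. This is a minimal sketch, assuming `bitsandbytes` is installed and a CUDA GPU is available; the quant type and compute dtype below are common defaults, not a published configuration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization config (assumed settings, not an official recipe)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_name = "subsectmusic/riko-qwen3-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```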
## 📞 Support & Community

- **Issues:** Report via GitHub Issues
- **Discussions:** Join the community discussions
- **Updates:** Follow for model improvements and new versions

---

Made with ❤️ using Unsloth

Training AI personalities, one tsundere at a time!