---
tags:
- roleplay
- Qwen3
- fine-tuned
- unsloth
- 4bit
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
model_name: qwen3-4b-rpg-roleplay
---

<div align="center">
<h1>Qwen3-4B Roleplay Fine-Tuned Model</h1>
<p>by <strong>Chun (@chun121)</strong></p>
<img src="https://img.shields.io/badge/framework-unsloth-important" alt="Unsloth">
<img src="https://img.shields.io/badge/quantization-4bit-blue" alt="4bit">
<img src="https://img.shields.io/badge/PEFT-LoRA-green" alt="LoRA">
</div>

## Model Overview

This repository contains the **Qwen3-4B** model fine-tuned on the **Gryphe-Aesir-RPG-Charcards** mixed-split dataset for immersive roleplay scenarios. The model uses **4-bit quantization** and **LoRA** adapters to enable efficient training and inference on consumer-grade GPUs.

- **Base model**: Qwen3-4B (4-bit, bitsandbytes)
- **Fine-tuning**: PEFT LoRA, rank = 16, α = 16 (see the configuration sketch below)
- **Context window**: 1024 tokens
- **Quantization**: GGUF exports (Q8_0, F16, Q4_K_M)

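The adapter hyperparameters above map onto a standard PEFT configuration. The sketch below is illustrative only: `target_modules` and the dropout value are assumptions (typical choices for Qwen-family attention and MLP projections), not values taken from the actual training script.

```python
from peft import LoraConfig

# Illustrative LoRA setup matching rank=16, alpha=16 above.
# target_modules and lora_dropout are assumptions, not from the original run.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```
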
## Training Data

The model was trained on the [PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split](https://huggingface.co/datasets/PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split) dataset, which features:

- **Type**: Roleplay conversations
- **Format**: Parquet, splits `0` and `1` combined (loading sketch below)
- **Content**: Character-driven system prompts, human inputs, and AI responses
- **Sensitive content**: Contains explicit adult themes; intended for mature audiences only

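To rebuild the training mix, the two Parquet splits can be pulled and concatenated with the `datasets` library. This is a sketch under the assumption that the repository exposes splits literally named `0` and `1`, as listed above.

```python
from datasets import load_dataset, concatenate_datasets

repo = "PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split"

# Assumes the dataset exposes splits named "0" and "1", per the list above.
split_0 = load_dataset(repo, split="0")
split_1 = load_dataset(repo, split="1")
combined = concatenate_datasets([split_0, split_1])

print(combined)
```
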
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the 4-bit base model, then attach the LoRA adapters
base_model_id = "unsloth/Qwen3-4B-unsloth-bnb-4bit"
adapter_repo = "chun121/qwen3-4B-rpg-roleplay"

tokenizer = AutoTokenizer.from_pretrained(adapter_repo)
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_repo)

# Inference example
inputs = tokenizer("[SYSTEM]: You are a daring adventurer...", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

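For multi-turn roleplay you may get better results by formatting the conversation through the tokenizer's built-in chat template rather than hand-writing a `[SYSTEM]:` prefix. A hedged sketch follows; the character card and user turn are placeholders, and it assumes the tokenizer ships a chat template as Qwen3 tokenizers normally do.

```python
# Optional: build the prompt with the tokenizer's chat template
messages = [
    {"role": "system", "content": "You are Kaelen, a daring adventurer..."},  # hypothetical character card
    {"role": "user", "content": "We reach the gates of the ruined keep. What do you do?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
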
## Performance & Evaluation

- **Training steps**: 200 (with sequence packing; a matching trainer config sketch follows)
- **Batch size**: 8
- **Learning rate**: 1e-4 (cosine schedule)
- **Hardware**: NVIDIA T4 (Colab free tier)

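For reference, `transformers.TrainingArguments` along these lines would mirror the settings above. This is a sketch, not the original configuration: warmup, optimizer, and precision flags are assumptions.

```python
from transformers import TrainingArguments

# Illustrative arguments mirroring the hyperparameters listed above.
# warmup_steps, optim, and fp16 are assumptions, not from the original run.
training_args = TrainingArguments(
    output_dir="qwen3-4b-rpg-roleplay",
    max_steps=200,
    per_device_train_batch_size=8,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,              # assumption
    logging_steps=10,
    optim="adamw_bnb_8bit",       # common for 4-bit LoRA runs; an assumption here
    fp16=True,                    # the T4 used for training has no bfloat16 support
    report_to="none",
)
```
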
## Files & Formats

| File/Format | Description |
|------------------|-------------------------------------------|
| `lora_model/` | Saved LoRA adapters |
| `gguf_q8_0/` | GGUF export (q8_0 quantization) |
| `gguf_f16/` | GGUF export (f16, unquantized) |
| `gguf_q4_k_m/` | GGUF export (q4_k_m quantization); see loading example below |

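The GGUF exports can be run locally without `transformers`, for example via `llama-cpp-python`. A minimal sketch, assuming the q4_k_m file has been downloaded; the file name is a placeholder for whatever the actual GGUF file in `gguf_q4_k_m/` is called.

```python
from llama_cpp import Llama

# Path and file name are hypothetical; point this at the downloaded q4_k_m GGUF file.
llm = Llama(model_path="gguf_q4_k_m/qwen3-4b-rpg-roleplay-Q4_K_M.gguf", n_ctx=1024)

out = llm("[SYSTEM]: You are a daring adventurer...\n[USER]: Hello!", max_tokens=128)
print(out["choices"][0]["text"])
```
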
## Licensing

This model is released under the Apache-2.0 license. Please abide by the dataset's terms of use and ensure compliance with any applicable regulations when deploying.

## Acknowledgments

- Dataset by PJMixers-Dev on Hugging Face
- Unsloth & PEFT teams for efficient fine-tuning utilities

## Contact

For questions or collaborations, reach out to **Chun (@chun121)** on Hugging Face.