---
tags:
- roleplay
- Qwen3
- fine-tuned
- unsloth
- 4bit
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
model_name: qwen3-4b-rpg-roleplay
---

<div align="center">
<h1>Qwen3-4B Roleplay Fine-Tuned Model</h1>
<p>by <strong>Chun (@chun121)</strong></p>
<img src="https://img.shields.io/badge/framework-unsloth-important" alt="Unsloth">
<img src="https://img.shields.io/badge/quantization-4bit-blue" alt="4bit">
<img src="https://img.shields.io/badge/PEFT-LoRA-green" alt="LoRA">
</div>

## Model Overview

This repository contains the **Qwen3-4B** model fine-tuned on the **Gryphe-Aesir-RPG-Charcards** mixed-split dataset for immersive roleplay scenarios. The model uses **4-bit quantization** and **LoRA** adapters to enable efficient training and inference on consumer-grade GPUs.

- **Base model**: Qwen3-4B (4-bit, bitsandbytes)
- **Fine-tuning**: PEFT LoRA, rank = 16, α = 16 (see the configuration sketch below)
- **Context window**: 1024 tokens
- **Quantization**: GGUF exports (Q8_0, F16, Q4_K_M)

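The adapter hyperparameters above map onto a standard PEFT configuration. The sketch below is illustrative only: `target_modules` and the dropout value are assumptions (typical choices for Qwen-family attention and MLP projections), not values taken from the actual training script.

```python
from peft import LoraConfig

# Illustrative LoRA setup matching rank=16, alpha=16 above.
# target_modules and lora_dropout are assumptions, not from the original run.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```
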
## Training Data

The model was trained on the [PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split](https://huggingface.co/datasets/PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split) dataset, which features:

- **Type**: Roleplay conversations
- **Format**: Parquet, splits `0` and `1` combined (loading sketch below)
- **Content**: Character-driven system prompts, human inputs, and AI responses
- **Sensitive content**: Contains explicit adult themes; intended for mature audiences only

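To rebuild the training mix, the two Parquet splits can be pulled and concatenated with the `datasets` library. This is a sketch under the assumption that the repository exposes splits literally named `0` and `1`, as listed above.

```python
from datasets import load_dataset, concatenate_datasets

repo = "PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split"

# Assumes the dataset exposes splits named "0" and "1", per the list above.
split_0 = load_dataset(repo, split="0")
split_1 = load_dataset(repo, split="1")
combined = concatenate_datasets([split_0, split_1])

print(combined)
```
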
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the 4-bit base model, then attach the LoRA adapters
base_model_id = "unsloth/Qwen3-4B-unsloth-bnb-4bit"
adapter_repo = "chun121/qwen3-4B-rpg-roleplay"

tokenizer = AutoTokenizer.from_pretrained(adapter_repo)
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_repo)

# Inference example
inputs = tokenizer("[SYSTEM]: You are a daring adventurer...", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

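For multi-turn roleplay you may get better results by formatting the conversation through the tokenizer's built-in chat template rather than hand-writing a `[SYSTEM]:` prefix. A hedged sketch follows; the character card and user turn are placeholders, and it assumes the tokenizer ships a chat template as Qwen3 tokenizers normally do.

```python
# Optional: build the prompt with the tokenizer's chat template
messages = [
    {"role": "system", "content": "You are Kaelen, a daring adventurer..."},  # hypothetical character card
    {"role": "user", "content": "We reach the gates of the ruined keep. What do you do?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
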
## Performance & Evaluation

- **Training steps**: 200 (with sequence packing; a matching trainer config sketch follows)
- **Batch size**: 8
- **Learning rate**: 1e-4 (cosine schedule)
- **Hardware**: NVIDIA T4 (Colab free tier)

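For reference, `transformers.TrainingArguments` along these lines would mirror the settings above. This is a sketch, not the original configuration: warmup, optimizer, and precision flags are assumptions.

```python
from transformers import TrainingArguments

# Illustrative arguments mirroring the hyperparameters listed above.
# warmup_steps, optim, and fp16 are assumptions, not from the original run.
training_args = TrainingArguments(
    output_dir="qwen3-4b-rpg-roleplay",
    max_steps=200,
    per_device_train_batch_size=8,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,              # assumption
    logging_steps=10,
    optim="adamw_bnb_8bit",       # common for 4-bit LoRA runs; an assumption here
    fp16=True,                    # the T4 used for training has no bfloat16 support
    report_to="none",
)
```
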
## Files & Formats

| File/Format | Description |
|------------------|-------------------------------------------|
| `lora_model/` | Saved LoRA adapters |
| `gguf_q8_0/` | GGUF export (q8_0 quantization) |
| `gguf_f16/` | GGUF export (f16, unquantized) |
| `gguf_q4_k_m/` | GGUF export (q4_k_m quantization); see loading example below |

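The GGUF exports can be run locally without `transformers`, for example via `llama-cpp-python`. A minimal sketch, assuming the q4_k_m file has been downloaded; the file name is a placeholder for whatever the actual GGUF file in `gguf_q4_k_m/` is called.

```python
from llama_cpp import Llama

# Path and file name are hypothetical; point this at the downloaded q4_k_m GGUF file.
llm = Llama(model_path="gguf_q4_k_m/qwen3-4b-rpg-roleplay-Q4_K_M.gguf", n_ctx=1024)

out = llm("[SYSTEM]: You are a daring adventurer...\n[USER]: Hello!", max_tokens=128)
print(out["choices"][0]["text"])
```
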
## Licensing

This model is released under the Apache-2.0 license. Please abide by the dataset's terms of use and ensure compliance with any applicable regulations when deploying.

## Acknowledgments

- Dataset by PJMixers-Dev on Hugging Face
- Unsloth & PEFT teams for efficient fine-tuning utilities

## Contact

For questions or collaborations, reach out to **Chun (@chun121)** on Hugging Face.