Chun121 committed (verified)
Commit 08542df · Parent(s): d983c20

Update README.md

Files changed (1): README.md (+86 -3)

README.md CHANGED: the previous front matter was replaced with the new model card below.
```diff
@@ -1,3 +1,86 @@
- ---
- license: mit
- ---
```
---
tags:
- roleplay
- Qwen3
- fine-tuned
- unsloth
- 4bit
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
model_name: qwen3-4b-rpg-roleplay
---

<div align="center">
  <h1>Qwen3-4B Roleplay Fine-Tuned Model</h1>
  <p>by <strong>Chun (@chun121)</strong></p>
  <img src="https://img.shields.io/badge/framework-unsloth-important" alt="Unsloth">
  <img src="https://img.shields.io/badge/quantization-4bit-blue" alt="4bit">
  <img src="https://img.shields.io/badge/PEFT-LoRA-green" alt="LoRA">
</div>

## 🌟 Model Overview

This repository contains the **Qwen3-4B** model fine-tuned on the **Gryphe-Aesir-RPG-Charcards** mixed-split dataset for immersive roleplay scenarios. The model uses **4-bit quantization** and **LoRA** adapters to enable efficient training and inference on consumer-grade GPUs.

- **Base model**: Qwen3-4B (4-bit, bitsandbytes)
- **Fine-tuning**: PEFT LoRA, rank=16, α=16 (see the configuration sketch below)
- **Context window**: 1024 tokens
- **Quantization**: GGUF (Q8_0, F16, Q4_K_M)

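For reference, the adapter setup above might be declared with Unsloth roughly as follows. This is a minimal sketch, not the original training script; the `target_modules` list in particular is an assumption (a common choice for Qwen-style architectures).

```python
# Sketch of the LoRA configuration described above (assumed, not the original script).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-unsloth-bnb-4bit",  # 4-bit base model
    max_seq_length=1024,                             # context window noted above
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # LoRA rank
    lora_alpha=16,  # LoRA alpha
    lora_dropout=0,
    # Typical attention/MLP projections for Qwen-style models (assumption):
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```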
## 📚 Training Data

The model was trained on the [PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split](https://huggingface.co/datasets/PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split) dataset, which features:

- **Type**: Roleplay conversations
- **Format**: Parquet, splits `0` and `1` combined (see the loading sketch below)
- **Content**: Character-driven system prompts, human inputs, and AI responses
- **Sensitivity**: Contains explicit adult themes; intended for mature audiences only

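Loading and combining the two splits might look like this. A sketch only: the split keys used here are taken from the list above and may differ from the dataset's actual layout.

```python
# Sketch: load the dataset and combine splits `0` and `1` (split keys assumed).
from datasets import load_dataset, concatenate_datasets

ds = load_dataset("PJMixers-Dev/Gryphe-Aesir-RPG-Charcards-Opus-Mixed-split")
train_data = concatenate_datasets([ds["0"], ds["1"]])
print(train_data)
```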
## 🚀 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base model (4-bit) and LoRA adapter repositories
base_model = "unsloth/Qwen3-4B-unsloth-bnb-4bit"
adapter_repo = "chun121/qwen3-4B-rpg-roleplay"

# Load the tokenizer and the quantized base model, then attach the LoRA adapters
tokenizer = AutoTokenizer.from_pretrained(adapter_repo)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_repo)

# Inference example
inputs = tokenizer("[SYSTEM]: You are a daring adventurer...", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

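Since Qwen3 is a chat model, the tokenizer's chat template is usually more reliable than a raw `[SYSTEM]:` string. A minimal sketch (the message contents here are illustrative, not taken from the training data):

```python
# Build the prompt with the tokenizer's chat template instead of a raw string.
messages = [
    {"role": "system", "content": "You are a daring adventurer..."},
    {"role": "user", "content": "We stand before the dungeon gate. What do you do?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```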
## 📈 Performance & Evaluation

- **Training steps**: 200 (with sequence packing; see the trainer sketch below)
- **Batch size**: 8
- **Learning rate**: 1e-4 (cosine schedule)
- **Hardware**: NVIDIA T4 (Colab free tier)

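These hyperparameters would map onto a TRL `SFTTrainer` run roughly as follows. The README does not name the trainer, so this is a hedged sketch assuming TRL; it reuses `model` and `train_data` from the sketches above, and `output_dir` is an arbitrary choice.

```python
# Hypothetical TRL trainer setup matching the hyperparameters listed above.
from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir="qwen3-4b-rpg-roleplay",  # arbitrary output path (assumption)
    max_steps=200,                       # training steps
    per_device_train_batch_size=8,       # batch size
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    packing=True,                        # sequence packing
)

trainer = SFTTrainer(model=model, train_dataset=train_data, args=args)
trainer.train()
```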
## 📦 Files & Formats

| File/Format    | Description                   |
|----------------|-------------------------------|
| `lora_model/`  | Saved LoRA adapters           |
| `gguf_q8_0/`   | GGUF quantized (Q8_0) model   |
| `gguf_f16/`    | GGUF (F16, unquantized) model |
| `gguf_q4_k_m/` | GGUF quantized (Q4_K_M) model |

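The GGUF builds can be run with llama.cpp-based tooling. A sketch using `llama-cpp-python`; the exact GGUF filenames in this repository are assumptions, so adjust the `filename` pattern to match the real files.

```python
# Hypothetical llama-cpp-python usage for the Q4_K_M build (filename pattern assumed).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="chun121/qwen3-4B-rpg-roleplay",
    filename="*q4_k_m.gguf",  # glob for the Q4_K_M file; adjust to the actual name
    n_ctx=1024,               # matches the fine-tuning context window
)

out = llm("[SYSTEM]: You are a daring adventurer...", max_tokens=128)
print(out["choices"][0]["text"])
```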
## 🔑 Licensing

This model is released under the Apache-2.0 license. Please abide by the dataset's terms of use and ensure compliance with any applicable regulations when deploying.

## 🤝 Acknowledgments

- Dataset by PJMixers-Dev on Hugging Face
- Unsloth & PEFT teams for efficient fine-tuning utilities

## 💬 Contact

For questions or collaborations, reach out to **Chun (@chun121)** on Hugging Face.