### Training Data

The training process involved two main stages:

1. **Domain training:** The base model was adapted to the *Wings of Fire* universe using a custom dataset of **3 million tokens** compiled directly from the book series. This step saturated the model with the specific lore, characters, and writing style of the source material.
2. **Instruction & Chat Fine-tuning:** The model was fine-tuned on a mixed dataset of **2,200 examples**:
   * **1,400 Roleplay Conversations:** Multi-turn conversational examples designed to teach the model how to adopt and maintain character personas from the series.
   * **800 Assistant Examples:** Instruction-response pairs focused on answering lore questions and following commands within the context of the *Wings of Fire* world.
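As an illustration, the stage-2 mix described above could be assembled by interleaving the two subsets before training. This is only a minimal sketch under assumptions: the record layout, field names, and the `build_mixed_dataset` helper are hypothetical, not the project's actual data pipeline; only the subset sizes (1,400 + 800 = 2,200) come from this README.

```python
import random

def build_mixed_dataset(roleplay, assistant, seed=0):
    """Combine the two stage-2 subsets and shuffle them together,
    so roleplay and assistant examples are interleaved during training.
    (Hypothetical helper; record structure is illustrative only.)"""
    mixed = list(roleplay) + list(assistant)
    random.Random(seed).shuffle(mixed)  # deterministic shuffle for reproducibility
    return mixed

# Dummy stand-ins sized like the README's subsets: 1,400 roleplay + 800 assistant.
roleplay = [{"type": "roleplay", "id": i} for i in range(1400)]
assistant = [{"type": "assistant", "id": i} for i in range(800)]

dataset = build_mixed_dataset(roleplay, assistant)
print(len(dataset))  # 2200
```

A fixed seed keeps the example ordering reproducible across training runs, which matters when comparing checkpoints trained on the same mix.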