Update README.md
README.md
@@ -45,18 +45,18 @@ This model has been fine-tuned using LoRA (Low-Rank Adaptation) technique on a c
 - **Learning Rate**: 2e-4
 - **Batch Size**: 6 per device
 - **Gradient Accumulation**: 1 step
-- **Warmup Steps**:
+- **Warmup Steps**: 5
 - **Weight Decay**: 0.01
 - **LR Scheduler**: linear
 - **Optimizer**: paged_adamw_8bit
 - **Precision**: bfloat16
 
 ### LoRA Configuration
-- **LoRA Rank**:
+- **LoRA Rank**: 32
 - **LoRA Alpha**: 32
-- **Target Modules**:
-- **Dropout**: 0.
-- **Max Sequence Length**:
+- **Target Modules**: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
+- **Dropout**: 0.05
+- **Max Sequence Length**: 4096
 
 ### Dataset
 - **Size**: 25,650 examples
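For reference, the before/after values in this diff map directly onto the standard `peft` and `transformers` configuration objects. The sketch below is a minimal illustration assuming those libraries; `output_dir`, `bias`, and `task_type` are illustrative assumptions not stated in the README, and the base model and dataset are omitted.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the "LoRA Configuration" section above.
lora_config = LoraConfig(
    r=32,                      # LoRA Rank
    lora_alpha=32,             # LoRA Alpha
    lora_dropout=0.05,         # Dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",               # assumption: not specified in the README
    task_type="CAUSAL_LM",     # assumption: causal-LM fine-tuning
)

# Training hyperparameters from the list above.
training_args = TrainingArguments(
    output_dir="lora-finetune",     # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=1,
    warmup_steps=5,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",       # paged 8-bit AdamW; requires bitsandbytes
    bf16=True,                      # bfloat16 precision
)

# The 4096-token max sequence length would be applied at tokenization time
# or through the trainer (e.g., trl's SFTTrainer), depending on the setup.
```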