RDson committed
Commit 1e45d6b · verified · 1 Parent(s): 56e1665

Update README.md

Files changed (1)
  1. README.md +11 -9
README.md CHANGED
@@ -23,10 +23,9 @@ Trained using LLaMA-Factory with the config:
  ```
  max_seq_length = 6*1024

- lora_rank = 32
- lora_alpha = lora_rank * 2
- lora_target = ["q_proj", "k_proj", "v_proj", "o_proj",
- "gate_proj", "up_proj", "down_proj"]

  args = dict(
  stage="sft",
@@ -38,25 +37,26 @@ args = dict(
  lora_target=lora_target,
  output_dir="qwen_distill_7b_lora",
  per_device_train_batch_size=1,
- gradient_accumulation_steps=3,
  lr_scheduler_type="cosine",
  logging_steps=1,
- warmup_ratio=0.1,
- save_steps=100,
  learning_rate=1e-4,
  num_train_epochs=1.0,
- max_grad_norm=1.0,
  loraplus_lr_ratio=16.0,
  fp16=True,
  report_to="none",
  preprocessing_num_workers=16,
  cutoff_len=max_seq_length,
  )
  ```

  System used:
  ```
- 'You are a helpful assistant. Please reason step by step inside the tags <think> and </think>. Conclude with **Answer** and put your final answer within \\boxed{}.'
  ```

  Custom template used in training:
@@ -89,6 +89,8 @@ register_template(
  )
  ```

  In the dataset for variation, I randomly replaced the start of the string "Okay," with one of the following:
  ```
  starts = [
 
  ```
  max_seq_length = 6*1024

+ lora_rank = 128
+ lora_alpha = lora_rank
+ lora_target = "all"

  args = dict(
  stage="sft",
 
  lora_target=lora_target,
  output_dir="qwen_distill_7b_lora",
  per_device_train_batch_size=1,
+ gradient_accumulation_steps=4,
  lr_scheduler_type="cosine",
  logging_steps=1,
+ warmup_ratio=0.05,
  learning_rate=1e-4,
  num_train_epochs=1.0,
+ max_grad_norm=0.25,
  loraplus_lr_ratio=16.0,
  fp16=True,
  report_to="none",
  preprocessing_num_workers=16,
  cutoff_len=max_seq_length,
+ optim="paged_adamw_8bit"
  )
+
  ```
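
Review note: the changed hyperparameters interact, so it can help to work out the derived quantities for both versions. The sketch below is illustrative only. It assumes the standard LoRA scaling of `lora_alpha / lora_rank` (not rsLoRA), a single training device (`num_devices=1`, which the README does not state), and that `loraplus_lr_ratio` is the ratio of the B-matrix learning rate to the A-matrix learning rate, as in LoRA+. The helper names are hypothetical and are not part of LLaMA-Factory.

```python
# Values copied from the removed (-) and added (+) lines of the diff.
old = {"lora_rank": 32, "lora_alpha": 32 * 2, "gradient_accumulation_steps": 3}
new = {"lora_rank": 128, "lora_alpha": 128, "gradient_accumulation_steps": 4}

# Settings that are the same in both versions of the README.
PER_DEVICE_TRAIN_BATCH_SIZE = 1
LEARNING_RATE = 1e-4
LORAPLUS_LR_RATIO = 16.0
MAX_SEQ_LENGTH = 6 * 1024


def lora_scale(cfg):
    # Standard LoRA multiplies the low-rank update by alpha / rank.
    return cfg["lora_alpha"] / cfg["lora_rank"]


def sequences_per_optimizer_step(cfg, num_devices=1):
    # num_devices=1 is an assumption; the README does not say how many GPUs were used.
    return PER_DEVICE_TRAIN_BATCH_SIZE * cfg["gradient_accumulation_steps"] * num_devices


def summarize(cfg):
    seqs = sequences_per_optimizer_step(cfg)
    return {
        "lora_scale": lora_scale(cfg),
        "sequences_per_step": seqs,
        # Upper bound only: reached if every sequence fills cutoff_len.
        "max_tokens_per_step": seqs * MAX_SEQ_LENGTH,
    }


# Under the LoRA+ assumption above, B matrices train with a larger learning rate.
lora_b_learning_rate = LEARNING_RATE * LORAPLUS_LR_RATIO

print("old:", summarize(old))
print("new:", summarize(new))
print("LoRA+ B-matrix learning rate:", lora_b_learning_rate)
```

Under these assumptions the commit quadruples the adapter rank while halving the scale applied to the update (2.0 to 1.0), and raises the number of sequences per optimizer step from 3 to 4.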

  System used:
  ```
+ 'Please reason step by step inside the <think> and </think> tags, and put your final answer within \\boxed{}.'
  ```
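
Review note: the system prompt implies an output shape (reasoning inside `<think>` and `</think>`, final answer inside `\boxed{}`), so a small parser can be used to check responses or dataset entries against it. The following is a minimal sketch, not code from the repository. The function names `split_reasoning` and `extract_boxed` are hypothetical, and it assumes the response begins with `<think>` and that the last `\boxed{...}` holds the final answer.

```python
def split_reasoning(response):
    # Return (reasoning, remainder) for a response shaped like
    # "<think>...</think> final text". Raises ValueError if a tag is missing.
    open_tag, close_tag = "<think>", "</think>"
    text = response.lstrip()
    if not text.startswith(open_tag):
        raise ValueError("response does not start with <think>")
    end = text.find(close_tag)
    if end == -1:
        raise ValueError("response has no closing </think>")
    reasoning = text[len(open_tag):end].strip()
    remainder = text[end + len(close_tag):].strip()
    return reasoning, remainder


def extract_boxed(text):
    # Return the contents of the last \boxed{...} in text, or None if there
    # is no complete one. Braces are counted so nested LaTeX is kept whole.
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # Unbalanced braces: no complete boxed answer.


sample = "<think>Half of one is one half.</think>\nThe result is \\boxed{\\frac{1}{2}}."
reasoning, remainder = split_reasoning(sample)
print(reasoning)
print(extract_boxed(remainder))
```

Counting braces rather than using a simple regular expression keeps answers such as `\frac{1}{2}` intact.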

  Custom template used in training:
 
  )
  ```

+ Every entry in the dataset starts with `<think>` and ends its reasoning with `</think>`.
+
  In the dataset for variation, I randomly replaced the start of the string "Okay," with one of the following:
  ```
  starts = [