Commit 41cdd0e by tyzhu · verified · 1 parent: 90c6a90

Model save

Files changed (1)
  1. README.md +27 -29
README.md CHANGED
@@ -3,23 +3,11 @@ license: other
3
  base_model: Qwen/Qwen1.5-4B
4
  tags:
5
  - generated_from_trainer
6
- datasets:
7
- - tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3
8
  metrics:
9
  - accuracy
10
  model-index:
11
  - name: lmind_nq_train6000_eval6489_v1_docidx_v3_Qwen_Qwen1.5-4B_lora2
12
- results:
13
- - task:
14
- name: Causal Language Modeling
15
- type: text-generation
16
- dataset:
17
- name: tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3
18
- type: tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3
19
- metrics:
20
- - name: Accuracy
21
- type: accuracy
22
- value: 0.44876923076923075
23
  library_name: peft
24
  ---
25
 
@@ -28,10 +16,10 @@ should probably proofread and complete it, then remove this comment. -->
28
 
29
  # lmind_nq_train6000_eval6489_v1_docidx_v3_Qwen_Qwen1.5-4B_lora2
30
 
31
- This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on the tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3 dataset.
32
  It achieves the following results on the evaluation set:
33
- - Loss: 4.5794
34
- - Accuracy: 0.4488
35
 
36
  ## Model description
37
 
@@ -62,22 +50,32 @@ The following hyperparameters were used during training:
62
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
63
  - lr_scheduler_type: constant
64
  - lr_scheduler_warmup_ratio: 0.05
65
- - num_epochs: 10.0
66
 
67
  ### Training results
68
 
69
- | Training Loss | Epoch | Step | Validation Loss | Accuracy |
70
- |:-------------:|:------:|:----:|:---------------:|:--------:|
71
- | 1.9569 | 0.9985 | 341 | 3.0300 | 0.4736 |
72
- | 1.8799 | 2.0 | 683 | 3.0993 | 0.468 |
73
- | 1.7649 | 2.9985 | 1024 | 3.2750 | 0.4650 |
74
- | 1.6077 | 4.0 | 1366 | 3.4406 | 0.4625 |
75
- | 1.4321 | 4.9985 | 1707 | 3.6500 | 0.4586 |
76
- | 1.2382 | 6.0 | 2049 | 3.8598 | 0.4562 |
77
- | 1.0525 | 6.9985 | 2390 | 4.0638 | 0.4541 |
78
- | 0.8607 | 8.0 | 2732 | 4.2389 | 0.4515 |
79
- | 0.7099 | 8.9985 | 3073 | 4.3484 | 0.4516 |
80
- | 0.5823 | 9.9854 | 3410 | 4.5794 | 0.4488 |
 
 
 
 
 
 
 
 
 
 
81
 
82
 
83
  ### Framework versions
 
3
  base_model: Qwen/Qwen1.5-4B
4
  tags:
5
  - generated_from_trainer
 
 
6
  metrics:
7
  - accuracy
8
  model-index:
9
  - name: lmind_nq_train6000_eval6489_v1_docidx_v3_Qwen_Qwen1.5-4B_lora2
10
+ results: []
 
 
 
 
 
 
 
 
 
 
11
  library_name: peft
12
  ---
13
 
 
16
 
17
  # lmind_nq_train6000_eval6489_v1_docidx_v3_Qwen_Qwen1.5-4B_lora2
18
 
19
+ This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
20
  It achieves the following results on the evaluation set:
21
+ - Loss: 5.3392
22
+ - Accuracy: 0.4286
23
 
24
  ## Model description
25
 
 
50
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
51
  - lr_scheduler_type: constant
52
  - lr_scheduler_warmup_ratio: 0.05
53
+ - num_epochs: 20.0
54
 
55
  ### Training results
56
 
57
+ | Training Loss | Epoch | Step | Accuracy | Validation Loss |
58
+ |:-------------:|:-------:|:----:|:--------:|:---------------:|
59
+ | 1.9569 | 0.9985 | 341 | 0.4736 | 3.0300 |
60
+ | 1.8799 | 2.0 | 683 | 0.468 | 3.0993 |
61
+ | 1.7649 | 2.9985 | 1024 | 0.4650 | 3.2750 |
62
+ | 1.6077 | 4.0 | 1366 | 0.4625 | 3.4406 |
63
+ | 1.4321 | 4.9985 | 1707 | 0.4586 | 3.6500 |
64
+ | 1.2382 | 6.0 | 2049 | 0.4562 | 3.8598 |
65
+ | 1.0525 | 6.9985 | 2390 | 0.4541 | 4.0638 |
66
+ | 0.8607 | 8.0 | 2732 | 0.4515 | 4.2389 |
67
+ | 0.7099 | 8.9985 | 3073 | 0.4516 | 4.3484 |
68
+ | 0.5823 | 9.9854 | 3410 | 0.4488 | 4.5794 |
69
+ | 0.4641 | 10.9985 | 3751 | 0.4495 | 4.7090 |
70
+ | 0.3755 | 12.0 | 4093 | 0.4354 | 4.9454 |
71
+ | 0.3235 | 12.9985 | 4434 | 0.4379 | 5.0624 |
72
+ | 0.2691 | 14.0 | 4776 | 0.4345 | 5.0957 |
73
+ | 0.2394 | 14.9985 | 5117 | 0.4368 | 5.1831 |
74
+ | 0.2112 | 16.0 | 5459 | 0.4326 | 5.3223 |
75
+ | 0.1994 | 16.9985 | 5800 | 0.4301 | 5.3839 |
76
+ | 0.1834 | 18.0 | 6142 | 0.4286 | 5.4236 |
77
+ | 0.1709 | 18.9985 | 6483 | 0.4291 | 5.4840 |
78
+ | 0.166 | 19.9854 | 6820 | 0.4286 | 5.3392 |
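
The logged metrics can be checked for the best evaluation point with a few lines of Python. This is only a sketch: the values are copied from the training results table, while the tuple layout and the helper name `best_checkpoint` are illustrative assumptions, not part of the model card or of any library API.

```python
# Each tuple: (step, training_loss, validation_loss, accuracy),
# copied from the training results table.
rows = [
    (341, 1.9569, 3.0300, 0.4736),
    (683, 1.8799, 3.0993, 0.468),
    (1024, 1.7649, 3.2750, 0.4650),
    (1366, 1.6077, 3.4406, 0.4625),
    (1707, 1.4321, 3.6500, 0.4586),
    (2049, 1.2382, 3.8598, 0.4562),
    (2390, 1.0525, 4.0638, 0.4541),
    (2732, 0.8607, 4.2389, 0.4515),
    (3073, 0.7099, 4.3484, 0.4516),
    (3410, 0.5823, 4.5794, 0.4488),
    (3751, 0.4641, 4.7090, 0.4495),
    (4093, 0.3755, 4.9454, 0.4354),
    (4434, 0.3235, 5.0624, 0.4379),
    (4776, 0.2691, 5.0957, 0.4345),
    (5117, 0.2394, 5.1831, 0.4368),
    (5459, 0.2112, 5.3223, 0.4326),
    (5800, 0.1994, 5.3839, 0.4301),
    (6142, 0.1834, 5.4236, 0.4286),
    (6483, 0.1709, 5.4840, 0.4291),
    (6820, 0.166, 5.3392, 0.4286),
]


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda r: r[2])


best = best_checkpoint(rows)
print(f"lowest validation loss {best[2]} at step {best[0]} (accuracy {best[3]})")
```

On these numbers the lowest validation loss is at the first logged step, while training loss keeps falling through the final step, which is the usual signature of overfitting.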
79
 
80
 
81
  ### Framework versions