hawei_LinkedIn committed on
Commit
a5c538b
·
1 Parent(s): 9adf32a

upload model and model card

.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
37
+ *.json filter=lfs diff=lfs merge=lfs -text
38
+ *.png filter=lfs diff=lfs merge=lfs -text
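The three added patterns route every JSON file and PNG plot through Git LFS, which is why the JSON files later in this commit show up as three-line pointer files. A minimal sketch of that matching follows. It assumes that `fnmatchcase` on a file's base name is a close enough stand-in for gitattributes matching when the patterns contain no slash. The helper `is_lfs_tracked` is hypothetical and not part of Git, and the list holds only the patterns visible in this hunk.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns visible in the .gitattributes hunk above; the last three are new.
# The real file has more patterns outside this hunk that are not listed here.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "tokenizer.json", "*.json", "*.png"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: match slash-free patterns against the base name."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


for path in ["config.json", "plots/alignment_best.png", "README.md"]:
    print(path, is_lfs_tracked(path))
```

Under this approximation `*.json` already covers `tokenizer.json`, so the explicit `tokenizer.json` entry adds nothing.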
README.md CHANGED
@@ -1,3 +1,110 @@
1
  ---
2
  license: llama3.1
3
+ datasets:
4
+ - nvidia/OpenMathInstruct-2
5
+ language:
6
+ - en
7
+ base_model:
8
+ - meta-llama/Llama-3.1-8B-Instruct
9
+ model-index:
10
+ - name: Control-LLM-Llama3.1-8B-Math16
11
+ results:
12
+ - task:
13
+ type: math-evaluation
14
+ dataset:
15
+ type: parquet
16
+ name: Math, Math Hard, GSM8K
17
+ dataset_kwargs:
18
+ data_files: "https://github.com/linkedin/ControlLLM/blob/main/src/controlllm/inference/llm_eval_harness/additional_tasks/math/joined_math.parquet"
19
+ metrics:
20
+ - name: exact_match,none
21
+ type: exact_match
22
+ value: 0.6205678398534606
23
+ stderr: 0.005249520342473376
24
+ verified: false
25
+ - name: exact_match,none (gsm8k_0shot_instruct)
26
+ type: exact_match
27
+ value: 0.8968915845337376
28
+ stderr: 0.008376436987507811
29
+ verified: false
30
+ - name: exact_match,none (meta_math_0shot_instruct)
31
+ type: exact_match
32
+ value: 0.6166
33
+ stderr: 0.006876797660918556
34
+ verified: false
35
+ - name: exact_match,none (meta_math_hard_0shot_instruct)
36
+ type: exact_match
37
+ value: 0.36027190332326287
38
+ stderr: 0.013198755610252931
39
+ verified: false
40
+ - task:
41
+ type: original-capability
42
+ dataset:
43
+ type: meta/Llama-3.1-8B-Instruct-evals
44
+ name: Llama-3.1-8B-Instruct-evals Dataset
45
+ dataset_path: "meta-llama/llama-3.1-8_b-instruct-evals"
46
+ dataset_name: "Llama-3.1-8B-Instruct-evals__arc_challenge__details"
47
+ metrics:
48
+ - name: exact_match,strict-match
49
+ type: exact_match
50
+ value: 0.6001372485281902
51
+ stderr: 0.002821514831773572
52
+ verified: false
53
+ - name: exact_match,strict-match (meta_arc_0shot_instruct)
54
+ type: exact_match
55
+ value: 0.8248927038626609
56
+ stderr: 0.011139722235859526
57
+ verified: false
58
+ - name: exact_match,strict-match (meta_gpqa_0shot_cot_instruct)
59
+ type: exact_match
60
+ value: 0.3080357142857143
61
+ stderr: 0.021836780796366417
62
+ verified: false
63
+ - name: exact_match,strict-match (meta_mmlu_0shot_instruct)
64
+ type: exact_match
65
+ value: 0.7159948725252813
66
+ stderr: 0.00380556397209409
67
+ verified: false
68
+ - name: exact_match,strict-match (meta_mmlu_pro_5shot_instruct)
69
+ type: exact_match
70
+ value: 0.45403922872340424
71
+ stderr: 0.004539171007529716
72
+ verified: false
73
  ---
74
+ # Control-LLM-Llama3.1-8B-Math16
75
+ This model is a fine-tune of Llama-3.1-8B-Instruct for mathematical tasks, trained on the OpenMathInstruct-2 (OpenMath2) dataset.
76
+
77
+ ## Evaluation Results
78
+ Here is an overview of the evaluation results and findings:
79
+
80
+ ### Catastrophic Forgetting on OpenMath
81
+ The following plot illustrates and compares the mitigation of catastrophic forgetting during training.
82
+
83
+ ![Catastrophic Forgetting](plots/ControlLLM_CF_Plot_Math.png)
84
+
85
+ ### Alignment Result
86
+ The plot below highlights the alignment result of the model trained with Control LLM.
87
+
88
+ ![Alignment](plots/alignment_best.png)
89
+
90
+ ### Benchmark Results Table
91
+ The table below summarizes evaluation results across mathematical tasks and original capabilities.
92
+
93
+ | **Model** | **MH** | **M** | **G8K** | **M-Avg** | **ARC** | **GPQA** | **MLU** | **MLUP** | **O-Avg** | **Overall** |
94
+ |-------------------|--------|--------|---------|-----------|---------|----------|---------|----------|-----------|-------------|
95
+ | Llama3.1-8B-Inst | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 56.3 |
96
+ | **Control LLM*** | 36.0 | 61.7 | **89.7**| 62.5 | 82.5 | 30.8 | **71.6**| 45.4 | **57.6** | **60.0** |
97
+
98
+ ---
99
+
100
+ ### Explanation:
101
+ - **MH**: MathHard
102
+ - **M**: Math
103
+ - **G8K**: GSM8K
104
+ - **M-Avg**: Math - Average across MathHard, Math, and GSM8K
105
+ - **ARC**: ARC benchmark
106
+ - **GPQA**: GPQA (Graduate-Level Google-Proof Q&A)
107
+ - **MLU**: MMLU (Massive Multitask Language Understanding)
108
+ - **MLUP**: MMLU Pro
109
+ - **O-Avg**: Original Capability - Average across ARC, GPQA, MMLU, and MMLU Pro
110
+ - **Overall**: Combined average, taken as the mean of M-Avg and O-Avg
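The averages in the Control LLM row can be checked with a few lines of arithmetic. The sketch below makes two assumptions: each group average is an unweighted mean of the displayed scores, and Overall is the mean of the two group averages rather than of all seven tasks. The variable names are illustrative only.

```python
from statistics import mean

# Scores for the Control LLM row, copied from the table above (percent).
math_scores = {"MH": 36.0, "M": 61.7, "G8K": 89.7}
original_scores = {"ARC": 82.5, "GPQA": 30.8, "MLU": 71.6, "MLUP": 45.4}

# Assumption: unweighted means within each group, then the mean of the
# two group averages for the overall score.
m_avg = mean(math_scores.values())
o_avg = mean(original_scores.values())
overall = mean([m_avg, o_avg])

print(f"M-Avg={m_avg:.1f}  O-Avg={o_avg:.1f}  Overall={overall:.1f}")
# prints: M-Avg=62.5  O-Avg=57.6  Overall=60.0
```

The same assumption is not guaranteed to hold for the baseline row, whose averages may come from unrounded or differently weighted scores.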
benchmark_results_code_instruct_mbpp_full_instruct_mbpp_full_0shot_instruct.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:73f28e47a94ecdea8ddfc31a400bd998545805e536d380d5e27be89c127a85d2
3
+ size 25067
benchmark_results_math_instruct.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:84f5ecb155cb0d90f3a674cda875595d0f82875469efaf2ada59b68a476ce682
3
+ size 19758
benchmark_results_meta_mmlu_0shot_instruct_meta_math_hard_0shot.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e76a91f6132c6bc67f61b4f20811be17804c2c63d097609207dd72717ce64f32
3
+ size 8220
benchmark_results_original_capability_instruct.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5fcca219b2d813ae896863459df232a375ee9839d9619300362b6350fc2c5df2
3
+ size 17088
config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8da07d008c75f7dce2f45183c9d5782cdac604434fc8ce7a5f3a8a2652a57b2d
3
+ size 1900
evaluation_results.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2b69a2e19dc8a0f499c05a03b44e85d4835e4676e111cf0b021389e638976058
3
+ size 474
generation_config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6e3dea9cdfa4a3e0edf1b52d9b1e9609aa22efcddce630060b748284663c8e67
3
+ size 177
model-00001-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1dffb3a2cc9aedf7ab9d572cd4556c3341a268a4dd7dcd2f7f7c57554bb222eb
3
+ size 4976715708
model-00002-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6e788da6d01317d0bfe87fe87f37757f9676339f5e823ab9670d2263ab70597d
3
+ size 4999803628
model-00003-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7c6dbb9f45d938bd005593c3a56f86e623035eebcc55c30ea855d5821c5afb00
3
+ size 4915916972
model-00004-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9719edbfeb2b6a7d2be2e95257c8a1e67e385d63c968df2d76c64172c83df9fa
3
+ size 4657956396
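Each added file above is stored as a Git LFS pointer: a `version` line, an `oid` line holding the hash algorithm and digest, and a `size` line giving the byte count of the real object. The sketch below parses one pointer and totals the four shard sizes listed in this commit. The function name `parse_lfs_pointer` is illustrative and not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file ("key value" per line) into fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents of model-00004-of-00004.safetensors, copied from above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9719edbfeb2b6a7d2be2e95257c8a1e67e385d63c968df2d76c64172c83df9fa
size 4657956396
"""
info = parse_lfs_pointer(pointer)

# Byte sizes of the four safetensors shards, in order.
shard_sizes = [4976715708, 4999803628, 4915916972, 4657956396]
total_bytes = sum(shard_sizes)

print(info["algo"], info["size"], f"{total_bytes / 1e9:.2f} GB")
# prints: sha256 4657956396 19.55 GB
```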
model.safetensors.index.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1a9c8d1f60f3aa192fa8cd45f480271fcb13050fac1c8214de4e049759ae32fd
3
+ size 32818
plots/ControlLLM_CF_Plot_Math.png ADDED

Git LFS Details

  • SHA256: db3a5147d350b214310dc9ea340e9f5c5cfb8a5b11cbbe13985978961182e25b
  • Pointer size: 132 Bytes
  • Size of remote file: 1.42 MB
plots/alignment_best.png ADDED

Git LFS Details

  • SHA256: d002f2959cb7be9bdf580b4c8e7c2d4b89c89402196349c608bc2c30ff90b0a9
  • Pointer size: 131 Bytes
  • Size of remote file: 383 kB
special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:453db79c09538b7953c4d9846d4bc0b46b86a296f285cdecc29f739f0b98f6a9
3
+ size 572
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a200d62d1a1177908f4310d7e367f0194d474db0038dc1f2f2434a3ef74af7d9
3
+ size 17210284
tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9e407a45cd60fbcdad88f1a22adab70157c47e858c0c94995de05e87b06205aa
3
+ size 55820