Update README.md
README.md
CHANGED
@@ -18,15 +18,35 @@ Use the following dataset to fine-tune codellama/CodeLlama-13B in order to improve
 
 - jondurbin/airoboros-2.2: Filter categories related to coding, reasoning and planning.
 - Open-Orca/OpenOrca: Filter the 'cot' category in 1M GPT4 dataset.
-- garage-bAInd/Open-Platypus: 100%
+- garage-bAInd/Open-Platypus: 100%
 
+
+| Metric | Value |
+| --- | --- |
+| humaneval-python | 49.39 |
+
+[Big Code Models Leaderboard](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)
+
+CodeLlama-34B-Python: 53.29
+
+CodeLlama-34B-Instruct: 50.79
+
+CodeLlama-13B-Instruct: 50.6
+
+CodeLlama-34B: 45.11
+
+CodeLlama-13B-Python: 42.89
+
+CodeLlama-13B: 35.07
+
+[Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 | Metric | Value |
 | --- | --- |
-| ARC | |
-| HellaSwag | |
-| MMLU | |
-| TruthfulQA | |
-| Average | |
+| ARC | 44.88 |
+| HellaSwag | 67.7 |
+| MMLU | 43.16 |
+| TruthfulQA | 40.88 |
+| Average | 49.15 |
 
 
 # **Code Llama**
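For context, here is a minimal sketch of how the dataset filtering described in the diff above could look with the Hugging Face `datasets` library. The column names (`category`, `id`), the category values, and the GPT-4 parquet filename are assumptions for illustration, not the repo's actual preprocessing code.

```python
# Hypothetical sketch of the data selection described in the README.
# Column names, category labels, and file names are assumptions.
from datasets import load_dataset

# jondurbin/airoboros-2.2: keep categories related to coding, reasoning and planning.
airoboros = load_dataset("jondurbin/airoboros-2.2", split="train")
airoboros = airoboros.filter(
    lambda ex: ex["category"] in {"coding", "reasoning", "plan"}  # assumed category labels
)

# Open-Orca/OpenOrca: keep the 'cot' category from the 1M GPT-4 examples.
openorca = load_dataset(
    "Open-Orca/OpenOrca",
    data_files="1M-GPT4-Augmented.parquet",  # assumed name of the GPT-4 data file
    split="train",
)
openorca = openorca.filter(lambda ex: ex["id"].startswith("cot."))  # assumed 'cot' id prefix

# garage-bAInd/Open-Platypus: use 100% of the dataset, no filtering.
platypus = load_dataset("garage-bAInd/Open-Platypus", split="train")

print(len(airoboros), len(openorca), len(platypus))
```

Under these assumptions, the three filtered sets would then feed a standard supervised fine-tune of codellama/CodeLlama-13B.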