Update README.md
README.md CHANGED
@@ -25,16 +25,10 @@ base_model:
 
 ## Main Message
 
-
-
-In our approach, we carefully selected the dense layers from Qwen2.5-0.5B to construct our model. Notably, while Qwen2.5-0.5B was trained on *18 trillion* tokens, our model was trained on only *5 billion* tokens—over three orders of magnitude fewer—yet it achieves comparable performance.
-
-**Note**: Please note that this model has not yet been instruction-tuned; instruction-tuning is an area of ongoing development.
+Here is the instruction-tuned version of the pretrained **Kiwi-1.0-0.7B** model. As can be seen in the table below, its results are on par with the SOTA Qwen2.5-0.5B.
 
 ## Evaluation Results
 
-
-
 ### Harness Evaluation
 
 The performance evaluation is based on the tasks evaluated on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
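Since the update announces an instruction-tuned checkpoint, a minimal usage sketch may help readers of the card. This assumes a standard `transformers` chat setup and that the tokenizer ships with a chat template; the repo id below is a placeholder, as the card does not state the exact path.

```python
# Minimal usage sketch for an instruction-tuned checkpoint via transformers.
# Assumptions: "your-org/Kiwi-1.0-0.7B-Instruct" is a placeholder repo id,
# and the tokenizer provides a chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Kiwi-1.0-0.7B-Instruct"  # placeholder: substitute the real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Build a chat prompt and generate a reply.
messages = [{"role": "user", "content": "Explain what instruction tuning changes in a base model."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```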
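The harness evaluation referenced above can be approximated with EleutherAI's lm-evaluation-harness. Below is a sketch, not the card's exact procedure: it assumes `lm-eval` >= 0.4, uses a placeholder repo id, and runs only a subset of leaderboard-style tasks with the harness's default few-shot settings (the leaderboard prescribes its own per-task counts).

```python
# Sketch: scoring a model on Open LLM Leaderboard-style tasks with lm-evaluation-harness.
# Install with: pip install lm-eval
# The repo id is a placeholder; few-shot counts here are harness defaults,
# whereas the leaderboard fixes task-specific settings.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/Kiwi-1.0-0.7B",  # placeholder repo id
    tasks=["arc_challenge", "hellaswag", "winogrande"],  # leaderboard-style subset
    batch_size=8,
)

# Print the metric dict for each evaluated task.
for task, metrics in results["results"].items():
    print(task, metrics)
```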