---
tags:
  - Qwen
  - instruct
  - instruction
  - empirischtech
license: cc-by-4.0
metrics:
  - lm-evaluation-harness
base_model:
  - empirischtech/Kiwi-1.0-0.7B-32k
datasets:
  - tatsu-lab/alpaca
  - BAAI/Infinity-Instruct
---

# Kiwi-1.0-0.7B-32k-Instruct

**Instruction-Tuned Model**

## Main Message

This is the instruction-tuned version of the pretrained Kiwi-1.0-0.7B model. As can be seen in the table below, its results are on par with the state-of-the-art Qwen2.5-0.5B.
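
A minimal usage sketch with the Hugging Face `transformers` library is shown below. The prompt, generation settings, and the assumption that the tokenizer ships a chat template are illustrative and not taken from this card; adjust them to the model's actual prompt format.

```python
# Minimal sketch: load the instruction-tuned checkpoint and generate a reply.
# Assumes the tokenizer provides a chat template; if the model expects a plain
# instruction format (e.g. Alpaca-style prompts), build the prompt string manually.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "empirischtech/Kiwi-1.0-0.7B-32k-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain what an instruction-tuned model is in one sentence."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```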

## Evaluation Results

The performance evaluation follows the tasks used on the Open LLM Leaderboard. The model is evaluated on five benchmarks: ARC-Challenge, HellaSwag, MMLU-PRO, IFEval and GPQA. Evaluation is performed with the lm-evaluation-harness library.
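
To reproduce scores of this kind, a hedged sketch using the `lm_eval` Python API is shown below. The task names, dtype, and batch size are assumptions for illustration and may differ from the exact leaderboard configuration used for the table.

```python
# Sketch of an lm-evaluation-harness run (pip install lm-eval).
# Task selection and batch size are illustrative; the leaderboard setup
# (few-shot counts, GPQA/MMLU-PRO variants) may differ.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=empirischtech/Kiwi-1.0-0.7B-32k-Instruct,dtype=bfloat16",
    tasks=["arc_challenge", "hellaswag", "ifeval"],
    batch_size=8,
)

# Print the per-task metric dictionaries reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```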

### Main Results

| Metric | Qwen2.5-0.5B-Instruct | Kiwi-1.0-0.7B-32k-Instruct |
|---|---|---|
| ARC | 33.45 | 32.34 |
| HellaSwag | 52.37 | 48.59 |
| MMLU-PRO | 14.03 | 12.89 |
| IFEval | 37.53 | 27.10 |
| GPQA (Diamond, zero-shot CoT) | 12.27 | 17.17 |
| **Average** | 29.93 | 27.27 |