---
base_model:
- Qwen/Qwen2.5-0.5B
license: apache-2.0
---

Qwen2.5-0.5B quantized to 4 bits per-tensor. Comparable performance to GPTQ (group size 128, desc_act).

|    Tasks    |Version|Filter|n-shot| Metric |   | f16  | this | gptq |
|-------------|------:|------|-----:|--------|---|-----:|-----:|-----:|
|arc_challenge|      1|none  |     0|acc     |↑  |0.2918|0.2705|**0.2730**|
|arc_easy     |      1|none  |     0|acc     |↑  |0.6465|**0.6393**|0.6031|
|boolq        |      2|none  |     0|acc     |↑  |0.6208|0.5862|**0.6232**|
|hellaswag    |      1|none  |     0|acc     |↑  |0.4061|0.3888|**0.3969**|
|piqa         |      1|none  |     0|acc     |↑  |0.7051|**0.6861**|0.6801|
|winogrande   |      1|none  |     0|acc     |↑  |0.5635|**0.5762**|0.5659|
|average      |       |      |      |acc     |↑  |0.5390|**0.5245**|0.5237|

To reproduce the evals, see this [colab](https://gist.github.com/smpanaro/5890838e424b2970a287e4a05f9049b6).

Note: This model is fake-quantized, and the scaling vectors are fused into the weights for ease of evaluation, so the weights are float16 and have more than 16 unique values. See the colab above for how to convert to weights with exactly 16 unique values.
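Fake quantization rounds each weight to a 4-bit grid and immediately dequantizes it, so the stored tensor is still floating point but is constrained to the quantized values. A minimal NumPy sketch of symmetric per-tensor fake quantization (the function name and details are illustrative, not the actual conversion code used for this model):

```python
import numpy as np

def fake_quantize_per_tensor(w, n_bits=4):
    """Symmetric per-tensor fake quantization: round to a signed n-bit
    grid, then dequantize. The result is float but takes at most
    2**n_bits distinct values."""
    qmax = 2 ** (n_bits - 1) - 1          # 7 for 4-bit signed
    scale = np.abs(w).max() / qmax        # a single scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
w_fq, scale = fake_quantize_per_tensor(w)
print(len(np.unique(w_fq)))  # at most 16 unique values
```

With a single per-tensor scale the dequantized weights have at most 16 unique values; fusing additional per-channel scaling vectors into the weights (as in this checkpoint) multiplies each row by a different factor, which is why the shipped float16 weights show more than 16 unique values.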