---
base_model:
- Qwen/Qwen2.5-0.5B
license: apache-2.0
---

Qwen2.5-0.5B quantized to 4 bits per-tensor. Comparable performance to GPTQ (group size 128, desc_act).

|    Tasks    |Version|Filter|n-shot| Metric |   | f16  | this | gptq |
|-------------|------:|------|-----:|--------|---|-----:|-----:|-----:|
|arc_challenge|      1|none  |     0|acc     |↑  |0.2918|0.2705|**0.2730**|
|arc_easy     |      1|none  |     0|acc     |↑  |0.6465|**0.6393**|0.6031|
|boolq        |      2|none  |     0|acc     |↑  |0.6208|0.5862|**0.6232**|
|hellaswag    |      1|none  |     0|acc     |↑  |0.4061|0.3888|**0.3969**|
|piqa         |      1|none  |     0|acc     |↑  |0.7051|**0.6861**|0.6801|
|winogrande   |      1|none  |     0|acc     |↑  |0.5635|**0.5762**|0.5659|
|average      |       |      |      |acc     |↑  |0.5390|**0.5245**|0.5237|

To reproduce the evals, see this [colab](https://gist.github.com/smpanaro/5890838e424b2970a287e4a05f9049b6).

Note: This model is fake-quantized, and the scaling vectors are fused into the weights for ease of evaluation, so the weights are float16 and have more than 16 unique values. See the colab above for how to convert to weights with exactly 16 unique values.
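Fake quantization rounds each weight to a 4-bit grid and immediately dequantizes it, so the stored tensor is still floating point but is constrained to the quantized values. A minimal NumPy sketch of symmetric per-tensor fake quantization (the function name and details are illustrative, not the actual conversion code used for this model):

```python
import numpy as np

def fake_quantize_per_tensor(w, n_bits=4):
    """Symmetric per-tensor fake quantization: round to a signed n-bit
    grid, then dequantize. The result is float but takes at most
    2**n_bits distinct values."""
    qmax = 2 ** (n_bits - 1) - 1          # 7 for 4-bit signed
    scale = np.abs(w).max() / qmax        # a single scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
w_fq, scale = fake_quantize_per_tensor(w)
print(len(np.unique(w_fq)))  # at most 16 unique values
```

With a single per-tensor scale the dequantized weights have at most 16 unique values; fusing additional per-channel scaling vectors into the weights (as in this checkpoint) multiplies each row by a different factor, which is why the shipped float16 weights show more than 16 unique values.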