Qwen2.5-0.5B quantized to 4 bits with a single per-tensor scale. Performance is comparable to GPTQ (group size 128, desc_act).
| Tasks | Version | Filter | n-shot | Metric | | f16 | this | gptq |
|---|---|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | ↑ | 0.2918 | 0.2705 | 0.2730 |
| arc_easy | 1 | none | 0 | acc | ↑ | 0.6465 | 0.6393 | 0.6031 |
| boolq | 2 | none | 0 | acc | ↑ | 0.6208 | 0.5862 | 0.6232 |
| hellaswag | 1 | none | 0 | acc | ↑ | 0.4061 | 0.3888 | 0.3969 |
| piqa | 1 | none | 0 | acc | ↑ | 0.7051 | 0.6861 | 0.6801 |
| winogrande | 1 | none | 0 | acc | ↑ | 0.5635 | 0.5762 | 0.5659 |
| average | | | | acc | ↑ | 0.5390 | 0.5245 | 0.5237 |
To reproduce the evals, see this Colab notebook.
Note: this model is fake-quantized, and the scaling vectors are fused into the weights for ease of evaluation, so the weights are stored as float16 and have more than 16 unique values. See the Colab above for how to convert to weights with exactly 16 unique values.
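For readers unfamiliar with the term, "fake quantization" means the weights are rounded onto a low-bit grid but kept in floating point rather than packed into 4-bit integers. A minimal sketch of symmetric per-tensor fake quantization is below; the function name and rounding scheme are illustrative assumptions, not this model's exact pipeline (which additionally fuses scaling vectors into the weights, which is why the released tensors show more than 16 unique values):

```python
import numpy as np

def fake_quantize_per_tensor(w: np.ndarray, n_bits: int = 4) -> np.ndarray:
    """Round a weight tensor onto a signed n-bit grid, then dequantize.

    One scale is shared by the whole tensor ("per-tensor"), so the output
    is float but takes at most 2**n_bits distinct values.
    """
    qmax = 2 ** (n_bits - 1) - 1              # e.g. 7 for 4-bit signed
    scale = np.abs(w).max() / qmax            # single scale for the tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # integers in [-8, 7]
    return (q * scale).astype(w.dtype)        # back to float on the grid

# Illustrative usage on random weights.
w = np.random.randn(64, 64).astype(np.float32)
wq = fake_quantize_per_tensor(w)
print(len(np.unique(wq)))  # at most 16 distinct values for 4-bit
```

Fusing a per-channel scaling vector into `wq` afterward multiplies each row by a different float, which is what pushes the unique-value count above 16 in the released checkpoint.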
Model tree for smpanaro/Qwen2.5-0.5B-4bit-PerTensor

- Base model: Qwen/Qwen2.5-0.5B