๐Ÿ› Playground | ๐Ÿ“„ Technical report | ๐Ÿ’ป GitHub | ๐Ÿ‘€ Sign up for the API

AtlaAI/Selene-1-Mini-Llama-3.1-8B-GPTQ-W8A8

This model is an 8-bit (W8A8) quantisation of AtlaAI/Selene-1-Mini-Llama-3.1-8B, produced with GPTQ and SmoothQuant using vLLM's llm-compressor library (https://docs.vllm.ai/en/stable/features/quantization/int8.html).
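To make the W8A8 label concrete, here is a minimal sketch of what 8-bit weights and 8-bit activations mean numerically. It uses plain symmetric round-to-nearest quantisation with one scale per tensor, which is an assumption made for illustration: it is not GPTQ or SmoothQuant, and it is not how llm-compressor is implemented. The function names and sample values are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantisation (illustrative only).

    Returns the integer codes and the float scale needed to recover
    approximate real values as code * scale.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def int8_dot(weights, activations):
    """Dot product with both operands quantised to int8 (the 'W8' and 'A8')."""
    w_codes, w_scale = quantize_int8(weights)
    a_codes, a_scale = quantize_int8(activations)
    # Accumulate in integers, then rescale once at the end.
    acc = sum(w * a for w, a in zip(w_codes, a_codes))
    return acc * w_scale * a_scale


# Hypothetical sample values, chosen only to exercise the functions.
weights = [0.12, -0.50, 0.33, 0.07]
activations = [1.5, -2.0, 0.25, 3.0]

exact = sum(w * a for w, a in zip(weights, activations))
approx = int8_dot(weights, activations)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The small gap between the two printed values is the quantisation error that methods such as GPTQ and SmoothQuant aim to reduce.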

Refer to the original model card for more details on the model.

The quantisation was calibrated on a sample of 512 datapoints drawn from the data used to train Selene-1-Mini. As a result, our quantised models show minimal performance degradation, losing less than 0.5% overall across benchmarks.
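As a rough illustration of why calibration data matters, the sketch below shows the per-channel smoothing step used in SmoothQuant-style schemes: activation ranges observed on calibration samples set a scale that moves outlier magnitude from activations into weights without changing the layer output. The migration strength `alpha=0.5`, the function names, and the toy numbers are assumptions for illustration, not the settings used for this model.

```python
def smoothing_scales(calib_acts, weights, alpha=0.5):
    """Per-input-channel scales s_j = max|X_j|**alpha / max|W_j|**(1 - alpha).

    calib_acts: list of samples, each a list of per-channel activations.
    weights:    list of rows, one row of output weights per input channel.
    Assumes every channel has a non-zero activation and weight maximum.
    """
    scales = []
    for j, w_row in enumerate(weights):
        act_max = max(abs(sample[j]) for sample in calib_acts)
        w_max = max(abs(w) for w in w_row)
        scales.append(act_max ** alpha / w_max ** (1 - alpha))
    return scales


def linear(x, weights):
    """y = x @ W for a single sample x and a channels-by-outputs matrix W."""
    n_out = len(weights[0])
    return [sum(x[j] * weights[j][k] for j in range(len(x))) for k in range(n_out)]


# Toy calibration set: channel 1 carries large activation outliers.
calib_acts = [[0.5, 40.0], [-1.0, -35.0]]
weights = [[0.2, -0.4], [0.01, 0.02]]

s = smoothing_scales(calib_acts, weights)
smoothed_acts = [[x / sj for x, sj in zip(sample, s)] for sample in calib_acts]
smoothed_weights = [[w * sj for w in row] for row, sj in zip(weights, s)]

original_out = linear(calib_acts[0], weights)
smoothed_out = linear(smoothed_acts[0], smoothed_weights)
print(original_out, smoothed_out)
```

After smoothing, the activation and weight ranges per channel are balanced, so both are easier to represent in 8 bits, while the layer output is mathematically unchanged.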

For reference, a GPTQ-quantised 8-bit Llama-3.1-8B shows ~1.5% degradation across benchmarks.


Model size: 8.03B params (Safetensors). Tensor types: BF16, I8.