medmekk/Llama-3.2-1B-ao-int8wo-gs128 (Quantized)

Quantization Type: int8_weight_only
Group Size: 128
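
As a rough illustration of what these two settings mean, the sketch below quantizes one weight row to int8 in groups of 128 values, each group with its own scale. It assumes symmetric scaling (scale = max absolute value / 127), and the function names are hypothetical. TorchAO's actual kernels and storage layout may differ.

```python
def quantize_int8_groupwise(row, group_size=128):
    """Symmetric per-group int8 quantization of one weight row (illustrative)."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    q_values, scales = [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Assumption: symmetric scaling; an all-zero group gets scale 1.0.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp to the int8 range.
        q_values.extend(max(-128, min(127, round(w / scale))) for w in group)
    return q_values, scales


def dequantize_int8_groupwise(q_values, scales, group_size=128):
    """Recover approximate float weights from int8 values and per-group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(q_values)]
```

Only the weights are stored as int8 in this scheme. A smaller group size means more scales to store but a tighter fit to each group's value range.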

Description

This model is a quantized version of the original model medmekk/Llama-3.2-1B-ao-int8wo-gs128.

It was quantized with the TorchAO library via the torchao-my-repo space.
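
A minimal sketch of loading the checkpoint for inference follows. It assumes `transformers`, `torch`, `torchao`, and `accelerate` are installed and that the repository loads through the standard `from_pretrained` path. The helper names `load_quantized_model` and `generate` are hypothetical, not part of any library.

```python
REPO_ID = "medmekk/Llama-3.2-1B-ao-int8wo-gs128"


def load_quantized_model(repo_id=REPO_ID):
    """Load the tokenizer and quantized model (downloads weights on first use)."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype="auto",   # use the dtype recorded in the checkpoint config
        device_map="auto",    # assumption: accelerate is available for placement
    )
    return tokenizer, model


def generate(prompt, max_new_tokens=32):
    """Generate a short continuation of the prompt with the quantized model."""
    tokenizer, model = load_quantized_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello, my name is")` would download the weights and return the decoded text. Output quality and speed depend on the hardware and installed library versions.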
