medmekk/Llama-3.2-1B-BNB-int4-nf4 (Quantized)

Description

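This repository contains a 4-bit quantized Llama-3.2-1B checkpoint. It was quantized to int4 with bitsandbytes, using the NF4 quantization type. The card names medmekk/Llama-3.2-1B-BNB-int4-nf4 itself as the original model, so the unquantized source checkpoint is not identified here.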

Quantization Details

  • Quantization Type: int4
  • Threshold: None
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: False
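
For reference, the settings above roughly correspond to the following bitsandbytes configuration in transformers. This is a minimal sketch, not the author's actual quantization script: the helper names `quantization_kwargs` and `build_config` are hypothetical, and the "Threshold: None" entry is left out because the card does not say which parameter it refers to.

```python
# Hypothetical helper: collects the settings listed on this card as keyword
# arguments for transformers.BitsAndBytesConfig.
def quantization_kwargs():
    return {
        "load_in_4bit": True,                # "Quantization Type: int4"
        "bnb_4bit_quant_type": "nf4",        # NF4 quantization type
        "bnb_4bit_use_double_quant": False,  # no double quantization
    }


def build_config():
    # Imported lazily so the settings can be inspected without
    # transformers or bitsandbytes installed.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**quantization_kwargs())
```

Passing the resulting object as `quantization_config=` to `from_pretrained` is how an unquantized checkpoint would typically be quantized at load time. This repository already stores quantized weights, so loading it should not require the config.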

Usage

You can use this model in your applications by loading it directly from the Hugging Face Hub:

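# Requires the bitsandbytes package in addition to transformers.
# AutoModel loads the base transformer without a language-modeling head.
from transformers import AutoModel

model = AutoModel.from_pretrained("medmekk/Llama-3.2-1B-BNB-int4-nf4")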
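
For text generation, a causal-LM head is needed. The sketch below shows one way this might look. It assumes the repository also ships tokenizer files, that bitsandbytes and accelerate are installed, and that a supported GPU is available; none of this is confirmed by the card. The function name `generate_text` is hypothetical.

```python
MODEL_ID = "medmekk/Llama-3.2-1B-BNB-int4-nf4"


def generate_text(prompt, max_new_tokens=32):
    # Imported lazily; requires transformers, bitsandbytes and accelerate.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository includes tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The quantization settings are read from the checkpoint's own config.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Hello, my name is")` would download the checkpoint on first use and return the prompt followed by the generated continuation.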

Model Details

  • Format: Safetensors
  • Model size: 764M params
  • Tensor types: F32, FP16, U8