medmekk/Llama-3.2-3B-Instruct-BNB-INT4 (Quantized)

Description

This model is a quantized version of the original Llama-3.2-3B-Instruct model. It has been quantized to int4 with bitsandbytes.

Quantization Details

  • Quantization Type: int4
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32
  • bnb_4bit_quant_storage: uint8
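
These settings map directly onto a transformers BitsAndBytesConfig. The sketch below shows how an equivalent 4-bit load could be configured; the base repository id and device placement are assumptions inferred from the model name, not taken from a published export script.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Mirror the quantization settings listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
    bnb_4bit_quant_storage=torch.uint8,
)

# Loading the full-precision base model with this config yields the same
# int4 layout (base repo id assumed from the model name).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)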

Usage

You can use this model in your applications by loading it directly from the Hugging Face Hub:

from transformers import AutoModelForCausalLM, AutoTokenizer

# The bitsandbytes int4 quantization config is stored with the checkpoint,
# so the model loads already quantized.
model = AutoModelForCausalLM.from_pretrained("medmekk/Llama-3.2-3B-Instruct-BNB-INT4")
tokenizer = AutoTokenizer.from_pretrained("medmekk/Llama-3.2-3B-Instruct-BNB-INT4")
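
A minimal generation sketch using the model's chat template follows; the prompt text and generation parameters are illustrative assumptions, not prescribed settings.

# Build a chat prompt and generate a short reply.
messages = [{"role": "user", "content": "Summarize what int4 quantization does."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
inputs = inputs.to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))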
