uisikdag/SmolVLM-Instruct-4bit-bitsnbytes-nf4


library_name: transformers
license: apache-2.0
language:
  - en
base_model:
  - HuggingFaceTB/SmolVLM-Instruct

This is a 4-bit NF4 quantized version of HuggingFaceTB/SmolVLM-Instruct, produced with bitsandbytes. The code used to generate the quantized model is given below.

The 8-bit config seems to be more accurate than this one.

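import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

# 4-bit NF4 weights with double quantization; computation runs in bfloat16
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model and quantize its weights while loading
model_nf4 = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceTB/SmolVLM-Instruct",
    quantization_config=nf4_config,
)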
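
To get a feel for what the config above trades off, here is a rough back-of-the-envelope sketch of weight memory at different precisions. Nothing in it comes from this repo: the parameter count (`N_PARAMS`), the block size of 64, and the double-quantization block size of 256 are assumptions based on commonly cited bitsandbytes/QLoRA defaults, and the helper names are made up for illustration. It also ignores layers that stay unquantized, activations, and framework overhead, so treat the numbers as an approximation only.

```python
# Rough weight-memory estimate. All constants below are assumptions,
# not values measured on this checkpoint.
N_PARAMS = 2.2e9  # assumed approximate parameter count of the base model


def bits_per_param_4bit(block_size=64, double_quant=True, dq_block_size=256):
    """Effective bits per weight for blockwise 4-bit quantization.

    Each block of `block_size` weights stores one scale constant.
    Without double quantization that constant is fp32 (32 bits).
    With double quantization it is stored in 8 bits, plus one fp32
    constant shared by every `dq_block_size` blocks.
    """
    if double_quant:
        overhead = 8 / block_size + 32 / (block_size * dq_block_size)
    else:
        overhead = 32 / block_size
    return 4 + overhead


def gigabytes(n_params, bits):
    """Convert a parameter count and bits-per-parameter into gigabytes."""
    return n_params * bits / 8 / 1e9


if __name__ == "__main__":
    rows = [
        ("bf16", 16),
        ("int8 (scales ignored)", 8),
        ("nf4", bits_per_param_4bit(double_quant=False)),
        ("nf4 + double quant", bits_per_param_4bit(double_quant=True)),
    ]
    for name, bits in rows:
        print(f"{name:24s} {bits:6.3f} bits/param  ~{gigabytes(N_PARAMS, bits):.2f} GB")
```

Under these assumptions, 4-bit NF4 needs roughly half the weight memory of the 8-bit config, which is the trade-off against the accuracy difference noted above.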