---
library_name: transformers
license: apache-2.0
language:
- en
base_model:
- HuggingFaceTB/SmolVLM-Instruct
---

This is a 4-bit NF4 quantized version of HuggingFaceTB/SmolVLM-Instruct. The code used to generate the quantized model is below.

```python
import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model and quantize its weights on the fly.
model_nf4 = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceTB/SmolVLM-Instruct",
    quantization_config=nf4_config,
)
```
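
For a rough sense of what `bnb_4bit_use_double_quant=True` saves, the sketch below estimates the storage cost per quantized weight. It is an approximation, not a measurement of this checkpoint. It assumes the block sizes described for QLoRA (64 weights per absmax constant, 256 constants per second-level constant), which this card does not state, and the weight count `n` is a hypothetical round number rather than the actual size of this model. Layers that bitsandbytes leaves unquantized are not counted.

```python
# Back-of-the-envelope storage cost for 4-bit blockwise quantization.
# Assumed block sizes (QLoRA defaults, not stated in this model card):
WEIGHT_BLOCK = 64      # weights that share one absmax constant
CONSTANT_BLOCK = 256   # absmax constants that share one second-level constant


def bits_per_weight(double_quant: bool) -> float:
    """Average number of bits stored per quantized weight."""
    if not double_quant:
        # 4-bit code plus one fp32 absmax per block of weights.
        return 4 + 32 / WEIGHT_BLOCK
    # 4-bit code, one 8-bit absmax per block of weights,
    # and one fp32 constant per block of absmax values.
    return 4 + 8 / WEIGHT_BLOCK + 32 / (WEIGHT_BLOCK * CONSTANT_BLOCK)


def quantized_size_gb(num_weights: float, double_quant: bool = True) -> float:
    """Estimated size in GB (10**9 bytes) of num_weights quantized weights."""
    return num_weights * bits_per_weight(double_quant) / 8 / 1e9


if __name__ == "__main__":
    n = 2e9  # hypothetical number of quantized weights, for illustration only
    print(f"bf16:                {n * 16 / 8 / 1e9:.2f} GB")
    print(f"4-bit, single quant: {quantized_size_gb(n, False):.2f} GB")
    print(f"4-bit, double quant: {quantized_size_gb(n, True):.2f} GB")
```

Under these assumptions double quantization trims the overhead of the quantization constants from 0.5 to about 0.127 bits per weight. The actual footprint of a loaded model can be checked with `model_nf4.get_memory_footprint()`.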