Loading the quantized model using AutoModelForCausalLM
by NamburiSrinath
Hi,
I was trying to load the model with:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained('RedHatAI/Llama-2-7b-chat-hf-FP8')
print(model)
but I am getting the following error:
Traceback (most recent call last):
File "/quantization/compress_vllm.py", line 3, in <module>
model = AutoModelForCausalLM.from_pretrained('RedHatAI/Llama-2-7b-chat-hf-FP8')
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 573, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 272, in _wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4278, in from_pretrained
config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
File "/usr/local/lib/python3.10/dist-packages/transformers/quantizers/auto.py", line 198, in merge_quantization_configs
quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
File "/usr/local/lib/python3.10/dist-packages/transformers/quantizers/auto.py", line 128, in from_dict
return target_cls.from_dict(quantization_config_dict)
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/quantization_config.py", line 119, in from_dict
config = cls(**config_dict)
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/quantization_config.py", line 1765, in __init__
self.post_init()
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/quantization_config.py", line 1773, in post_init
raise ValueError(f"Activation scheme {self.activation_scheme} not supported")
ValueError: Activation scheme static not supported
Is this the intended way of loading the model?
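Since this looks like an FP8 checkpoint prepared for vLLM (my script is even called compress_vllm.py), my fallback plan was to try serving it through vLLM directly instead of transformers. Roughly the sketch below — just what I intend to attempt, assuming vLLM's LLM API accepts this checkpoint and its static activation scheme:

from vllm import LLM, SamplingParams

# Attempt to load the FP8 checkpoint with vLLM instead of transformers
llm = LLM(model='RedHatAI/Llama-2-7b-chat-hf-FP8')

# Quick smoke test: generate a short completion and print it
outputs = llm.generate(['Hello, how are you?'], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)

Would that be the recommended path, or is there a way to make the AutoModelForCausalLM route work?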