runtime error
on with `exllama_config`.The value of `use_exllama` will be overwritten by `disable_exllama` passed in `GPTQConfig` or stored in your config file.
/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/transformers/quantizers/auto.py:167: UserWarning: You passed `quantization_config` or equivalent parameters to `from_pretrained` but the model you're loading already has a `quantization_config` attribute. The `quantization_config` from the model will be used.However, loading attributes (e.g. ['use_cuda_fp16', 'use_exllama', 'max_input_length', 'exllama_config', 'disable_exllama']) will be overwritten with the one you passed to `from_pretrained`. The rest will be ignored.
  warnings.warn(warning_msg)
Traceback (most recent call last):
  File "/home/user/app/app.py", line 11, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_repo, quantization_config=quantization_config)
  File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3197, in from_pretrained
    hf_********* = AutoHfQuantizer.from_config(config.quantization_config, pre_quantized=pre_quantized)
  File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 132, in from_config
    return target_cls(quantization_config, **kwargs)
  File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/transformers/quantizers/quantizer_gptq.py", line 47, in __init__
    from optimum.gptq import GPTQQuantizer
ModuleNotFoundError: No module named 'optimum'
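
The traceback ends in `ModuleNotFoundError: No module named 'optimum'`, raised when the GPTQ quantizer tries `from optimum.gptq import GPTQQuantizer`. A minimal sketch of a pre-flight check that could run before `from_pretrained` is shown here. The helper name `missing_modules` and the `required` list are hypothetical; only `optimum` is confirmed by the log, and other GPTQ dependencies may also be needed.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located.

    Only top-level names are supported here; dotted names can raise if the
    parent package is absent.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: the list holds only the module named in the traceback.
required = ["optimum"]

missing = missing_modules(required)
if missing:
    # Report the gap up front instead of failing deep inside from_pretrained.
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If `optimum` is reported missing, adding it to the app's dependencies (for example with `pip install optimum`) would likely address this particular import error.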