vLLM deployment error
Command: VLLM_WORKER_MULTIPROC_METHOD="spawn" CUDA_VISIBLE_DEVICES=4,5,6,7 python3 -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2.5-VL-72B-Instruct-AWQ --host 0.0.0.0 --port 8000 --pipeline-parallel-size 4
Error:
ERROR 02-26 14:32:26 registry.py:306] Error in inspecting model architecture 'Qwen2_5_VLForConditionalGeneration'
ERROR 02-26 14:32:26 registry.py:306] Traceback (most recent call last):
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 507, in _run_in_subprocess
ERROR 02-26 14:32:26 registry.py:306] returned.check_returncode()
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/subprocess.py", line 502, in check_returncode
ERROR 02-26 14:32:26 registry.py:306] raise CalledProcessError(self.returncode, self.args, self.stdout,
ERROR 02-26 14:32:26 registry.py:306] subprocess.CalledProcessError: Command '['/home/anaconda3/envs/xinference/bin/python3', '-m', 'vllm.model_executor.models.registry']' returned non-zero exit status 1.
ERROR 02-26 14:32:26 registry.py:306]
ERROR 02-26 14:32:26 registry.py:306] The above exception was the direct cause of the following exception:
ERROR 02-26 14:32:26 registry.py:306]
ERROR 02-26 14:32:26 registry.py:306] Traceback (most recent call last):
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 304, in _try_inspect_model_cls
ERROR 02-26 14:32:26 registry.py:306] return model.inspect_model_cls()
ERROR 02-26 14:32:26 registry.py:306] ^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 275, in inspect_model_cls
ERROR 02-26 14:32:26 registry.py:306] return _run_in_subprocess(
ERROR 02-26 14:32:26 registry.py:306] ^^^^^^^^^^^^^^^^^^^
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 510, in _run_in_subprocess
ERROR 02-26 14:32:26 registry.py:306] raise RuntimeError(f"Error raised in subprocess:\n"
ERROR 02-26 14:32:26 registry.py:306] RuntimeError: Error raised in subprocess:
ERROR 02-26 14:32:26 registry.py:306] /home/anaconda3/envs/xinference/lib/python3.11/site-packages/transformers/utils/hub.py:106: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
ERROR 02-26 14:32:26 registry.py:306] warnings.warn(
ERROR 02-26 14:32:26 registry.py:306] :128: RuntimeWarning: 'vllm.model_executor.models.registry' found in sys.modules after import of package 'vllm.model_executor.models', but prior to execution of 'vllm.model_executor.models.registry'; this may result in unpredictable behaviour
ERROR 02-26 14:32:26 registry.py:306] Traceback (most recent call last):
ERROR 02-26 14:32:26 registry.py:306] File "", line 198, in _run_module_as_main
ERROR 02-26 14:32:26 registry.py:306] File "", line 88, in _run_code
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 531, in
ERROR 02-26 14:32:26 registry.py:306] _run()
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 524, in _run
ERROR 02-26 14:32:26 registry.py:306] result = fn()
ERROR 02-26 14:32:26 registry.py:306] ^^^^
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 276, in
ERROR 02-26 14:32:26 registry.py:306] lambda: _ModelInfo.from_model_cls(self.load_model_cls()))
ERROR 02-26 14:32:26 registry.py:306] ^^^^^^^^^^^^^^^^^^^^^
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 279, in load_model_cls
ERROR 02-26 14:32:26 registry.py:306] mod = importlib.import_module(self.module_name)
ERROR 02-26 14:32:26 registry.py:306] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/importlib/__init__.py", line 126, in import_module
ERROR 02-26 14:32:26 registry.py:306] return _bootstrap._gcd_import(name[level:], package, level)
ERROR 02-26 14:32:26 registry.py:306] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 02-26 14:32:26 registry.py:306] File "", line 1204, in _gcd_import
ERROR 02-26 14:32:26 registry.py:306] File "", line 1176, in _find_and_load
ERROR 02-26 14:32:26 registry.py:306] File "", line 1147, in _find_and_load_unlocked
ERROR 02-26 14:32:26 registry.py:306] File "", line 690, in _load_unlocked
ERROR 02-26 14:32:26 registry.py:306] File "", line 940, in exec_module
ERROR 02-26 14:32:26 registry.py:306] File "", line 241, in _call_with_frames_removed
ERROR 02-26 14:32:26 registry.py:306] File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/qwen2_5_vl.py", line 36, in
ERROR 02-26 14:32:26 registry.py:306] from transformers.models.qwen2_5_vl import (Qwen2_5_VLImageProcessor,
ERROR 02-26 14:32:26 registry.py:306] ImportError: cannot import name 'Qwen2_5_VLImageProcessor' from 'transformers.models.qwen2_5_vl' (/home/anaconda3/envs/xinference/lib/python3.11/site-packages/transformers/models/qwen2_5_vl/__init__.py)
ERROR 02-26 14:32:26 registry.py:306]
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in _run_code
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 911, in
uvloop.run(run_server(args))
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/uvloop/__init__.py", line 105, in run
return runner.run(wrapper())
^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/uvloop/__init__.py", line 61, in wrapper
return await main
^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 875, in run_server
async with build_async_engine_client(args) as engine_client:
File "/home/anaconda3/envs/xinference/lib/python3.11/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client
async with build_async_engine_client_from_engine_args(
File "/home/anaconda3/envs/xinference/lib/python3.11/contextlib.py", line 210, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 160, in build_async_engine_client_from_engine_args
engine_client = AsyncLLMEngine.from_engine_args(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 639, in from_engine_args
engine_config = engine_args.create_engine_config(usage_context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 1075, in create_engine_config
model_config = self.create_model_config()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 998, in create_model_config
return ModelConfig(
^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/config.py", line 364, in __init__
self.multimodal_config = self._init_multimodal_config(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/config.py", line 424, in _init_multimodal_config
if ModelRegistry.is_multimodal_model(architectures):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 445, in is_multimodal_model
model_cls, _ = self.inspect_model_cls(architectures)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 405, in inspect_model_cls
return self._raise_for_unsupported(architectures)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/anaconda3/envs/xinference/lib/python3.11/site-packages/vllm/model_executor/models/registry.py", line 357, in _raise_for_unsupported
raise ValueError(
ValueError: Model architectures ['Qwen2_5_VLForConditionalGeneration'] failed to be inspected. Please check the logs for more details.
@classdemo
The root cause is the ImportError at the bottom of the registry log: this vLLM build expects `Qwen2_5_VLImageProcessor` in `transformers.models.qwen2_5_vl`, but the installed transformers version does not export that class, so model inspection fails and the "failed to be inspected" ValueError follows. To resolve this, apply the workaround described here:
https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct-AWQ/discussions/7
and then reinstall transformers from the pinned commit:
pip install --force-reinstall git+https://github.com/huggingface/transformers.git@9985d06add07a4cc691dc54a7e34f54205c04d40
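After reinstalling, you can confirm the missing class is importable before restarting the server. `has_symbol` below is a small hypothetical helper (not part of vLLM or transformers) that probes a module for a named attribute without crashing if the package is absent; in your environment, `has_symbol("transformers.models.qwen2_5_vl", "Qwen2_5_VLImageProcessor")` should return `True` once the fix is in place.

```python
import importlib
import importlib.util


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `symbol`.

    Mirrors what vLLM's registry subprocess does when it inspects a model
    class: import the module, then look up the attribute.
    """
    # find_spec avoids an ImportError when the top-level package
    # is not installed at all.
    if importlib.util.find_spec(module_name.split(".")[0]) is None:
        return False
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        # Package exists but the submodule (or one of its imports) is broken.
        return False
    return hasattr(mod, symbol)
```

If this still returns `False` after the reinstall, the old transformers build is likely shadowing the new one (e.g. a stale install in another environment), so double-check which `site-packages` your `python3` resolves.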