OpenGVLab/InternVL2-4B

@@ -7,7 +7,7 @@ pipeline_tag: image-text-to-text
 [\[📂 GitHub\]](https://github.com/OpenGVLab/InternVL)  [\[🆕 Blog\]](https://internvl.github.io/blog/)  [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238)  [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821)
-[\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)  [\[🤗 HF Demo\]](https://huggingface.co/spaces/OpenGVLab/InternVL)  [\[🚀 Quick Start\]](#quick-start)  [\[📖 Chinese Explainer\]](https://zhuanlan.zhihu.com/p/675877376)
 ## Introduction
@@ -315,7 +315,132 @@ print(f'Assistant: {response}')
 ### LMDeploy
-> Warning: This model is not yet supported by LMDeploy.
 ## License
@@ -410,7 +535,131 @@ InternVL 2.0 是一个多模态大语言模型系列，包含各种规模的模
 ### LMDeploy
-> Warning: This model is not yet supported by LMDeploy.
 ## License

 [\[📂 GitHub\]](https://github.com/OpenGVLab/InternVL)  [\[🆕 Blog\]](https://internvl.github.io/blog/)  [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238)  [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821)
+[\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)  [\[🤗 HF Demo\]](https://huggingface.co/spaces/OpenGVLab/InternVL)  [\[🚀 Quick Start\]](#quick-start)  [\[📖 Chinese Explainer\]](https://zhuanlan.zhihu.com/p/706547971)  \[🌟 [ModelScope](https://modelscope.cn/organization/OpenGVLab) | [Tutorial](https://mp.weixin.qq.com/s/OUaVLkxlk1zhFb1cvMCFjg) \]
 ## Introduction
 ### LMDeploy
+LMDeploy is a toolkit for compressing, deploying, and serving LLMs, developed by the MMRazor and MMDeploy teams.
+```sh
+pip install lmdeploy
+```
+LMDeploy abstracts the complex inference process of multi-modal Vision-Language Models (VLM) into an easy-to-use pipeline, similar to the Large Language Model (LLM) inference pipeline.
+#### A 'Hello, world' example
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
+from lmdeploy.vl import load_image
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+response = pipe(('describe this image', image))
+print(response.text)
+```
+If an `ImportError` occurs while running this example, install the required dependency packages as prompted.
+#### Multi-image inference
+When dealing with multiple images, put them all in one list. Keep in mind that multiple images increase the number of input tokens, so the context window (`session_len`) typically needs to be enlarged.
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
+from lmdeploy.vl import load_image
+from lmdeploy.vl.constants import IMAGE_TOKEN
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+image_urls=[
+    'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg',
+    'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/det.jpg'
+]
+images = [load_image(img_url) for img_url in image_urls]
+# Numbering images improves multi-image conversations
+response = pipe((f'Image-1: {IMAGE_TOKEN}\nImage-2: {IMAGE_TOKEN}\ndescribe these two images', images))
+print(response.text)
+```
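The numbered prompt in the example above can be generated for any number of images. The following is a minimal sketch: `build_multi_image_prompt` is a hypothetical helper, not part of LMDeploy, and the default `'<IMAGE_TOKEN>'` string is only a stand-in. In real use, pass the `IMAGE_TOKEN` constant imported from `lmdeploy.vl.constants`.

```python
def build_multi_image_prompt(question: str, num_images: int,
                             image_token: str = '<IMAGE_TOKEN>') -> str:
    """Build a prompt with one numbered image placeholder per line,
    followed by the question on the last line."""
    if num_images < 1:
        raise ValueError('num_images must be at least 1')
    # One "Image-k: <token>" line per image, numbered from 1.
    lines = [f'Image-{i}: {image_token}' for i in range(1, num_images + 1)]
    lines.append(question)
    return '\n'.join(lines)


prompt = build_multi_image_prompt('describe these two images', 2)
print(prompt)
```

The resulting string would take the place of the hand-written f-string, paired with the list of loaded images.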
+#### Batch prompts inference
+To run inference with a batch of prompts, place them in a list:
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
+from lmdeploy.vl import load_image
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+image_urls=[
+    "https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg",
+    "https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/det.jpg"
+]
+prompts = [('describe this image', load_image(img_url)) for img_url in image_urls]
+response = pipe(prompts)
+print(response)
+```
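Printing the batch result directly shows the raw response objects. A small sketch of how each reply could be matched back to its image follows. It assumes that the pipeline returns one response per prompt, in the same order, and that each response exposes a `.text` attribute as in the single-prompt examples. `pair_batch_results` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the snippet runs without a model.

```python
from types import SimpleNamespace


def pair_batch_results(sources, responses):
    """Pair each source label with the text of its response, preserving order."""
    if len(sources) != len(responses):
        raise ValueError('expected one response per source')
    return [(src, resp.text) for src, resp in zip(sources, responses)]


# Stand-in responses; with a real pipeline these would come from pipe(prompts).
fake_responses = [SimpleNamespace(text='first description'),
                  SimpleNamespace(text='second description')]
for src, text in pair_batch_results(['human-pose.jpg', 'det.jpg'], fake_responses):
    print(f'{src}: {text}')
```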
+#### Multi-turn conversation
+There are two ways to hold a multi-turn conversation with the pipeline. One is to construct messages in the OpenAI format and use the method introduced above; the other is to use the `pipeline.chat` interface.
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig, GenerationConfig
+from lmdeploy.vl import load_image
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg')
+gen_config = GenerationConfig(top_k=40, top_p=0.8, temperature=0.8)
+sess = pipe.chat(('describe this image', image), gen_config=gen_config)
+print(sess.response.text)
+sess = pipe.chat('What is the woman doing?', session=sess, gen_config=gen_config)
+print(sess.response.text)
+```
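The example above uses `pipeline.chat`. For the other route, the message list in OpenAI format could be assembled roughly as follows. This is a sketch under assumptions: `user_turn` and `assistant_turn` are hypothetical helpers, and the assistant text is a placeholder for whatever the model actually returned on the first turn.

```python
def user_turn(text, image_url=None):
    """Build a user message; attach the image as an `image_url` part if given."""
    if image_url is None:
        return {'role': 'user', 'content': text}
    return {
        'role': 'user',
        'content': [
            {'type': 'text', 'text': text},
            {'type': 'image_url', 'image_url': {'url': image_url}},
        ],
    }


def assistant_turn(text):
    """Build an assistant message holding a previous reply."""
    return {'role': 'assistant', 'content': text}


image_url = 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg'
messages = [user_turn('describe this image', image_url)]
# After the first reply arrives, append it together with the follow-up question.
messages.append(assistant_turn('<first reply text goes here>'))
messages.append(user_turn('What is the woman doing?'))
print(messages)
```

With a live pipeline, the list would presumably be passed as `pipe(messages)` after each user turn, with the reply's `.text` copied into the assistant turn before the next question.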
+#### Service
+For lmdeploy v0.5.0, configure the chat template first by creating the following JSON file, `chat_template.json`:
+```json
+{
+    "model_name":"internvl-phi3",
+    "meta_instruction":"我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。",
+    "stop_words":["<|end|>"]
+}
+```
+LMDeploy's `api_server` lets a model be packaged as a service with a single command. The RESTful APIs it provides are compatible with OpenAI's interfaces. Below is an example of starting the service:
+```shell
+lmdeploy serve api_server OpenGVLab/InternVL2-4B --backend pytorch --chat-template chat_template.json
+```
+The default port of `api_server` is `23333`. After the server starts, you can communicate with it from the terminal through `api_client`:
+```shell
+lmdeploy serve api_client http://0.0.0.0:23333
+```
+You can browse and try out the `api_server` APIs online through the Swagger UI at `http://0.0.0.0:23333`, or read the API specification [here](https://github.com/InternLM/lmdeploy/blob/main/docs/en/serving/restful_api.md).
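Because the served APIs follow OpenAI's interfaces, a request can also be sent from Python. The sketch below uses only the standard library and assumes that the server exposes the usual OpenAI-style `/v1/chat/completions` route and that the served model name equals the model path. Both are assumptions; the actual name can be checked against the server. `build_chat_request` and `send` are hypothetical helpers, and the network call is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_chat_request(base_url, model, text, image_url):
    """Build a POST request carrying one user turn with text and an image URL."""
    payload = {
        'model': model,
        'messages': [{
            'role': 'user',
            'content': [
                {'type': 'text', 'text': text},
                {'type': 'image_url', 'image_url': {'url': image_url}},
            ],
        }],
    }
    return urllib.request.Request(
        url=base_url.rstrip('/') + '/v1/chat/completions',
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


def send(request):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


request = build_chat_request(
    'http://0.0.0.0:23333',
    'OpenGVLab/InternVL2-4B',  # assumed served model name
    'describe this image',
    'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
)
# Uncomment once the server is running:
# reply = send(request)
# print(reply['choices'][0]['message']['content'])
```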
 ## License
 ### LMDeploy
+LMDeploy is a toolkit for compressing, deploying, and serving large language models (LLMs), developed by the MMRazor and MMDeploy teams.
+```sh
+pip install lmdeploy
+```
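+LMDeploy abstracts the complex inference process of multi-modal Vision-Language Models (VLM) into an easy-to-use pipeline, similar to the Large Language Model (LLM) inference pipeline.
+#### A 'Hello, world' example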
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
+from lmdeploy.vl import load_image
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+response = pipe(('describe this image', image))
+print(response.text)
+```
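+If an `ImportError` occurs while running this example, install the required dependency packages as prompted.
+#### Multi-image inference
+When dealing with multiple images, put them all in one list. Note that multiple images increase the number of input tokens, so the context window typically needs to be enlarged.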
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
+from lmdeploy.vl import load_image
+from lmdeploy.vl.constants import IMAGE_TOKEN
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+image_urls=[
+    'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg',
+    'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/det.jpg'
+]
+images = [load_image(img_url) for img_url in image_urls]
+response = pipe((f'Image-1: {IMAGE_TOKEN}\nImage-2: {IMAGE_TOKEN}\ndescribe these two images', images))
+print(response.text)
+```
+#### Batch prompts inference
+To run inference with a batch of prompts, place them in a list:
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
+from lmdeploy.vl import load_image
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+image_urls=[
+    "https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg",
+    "https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/det.jpg"
+]
+prompts = [('describe this image', load_image(img_url)) for img_url in image_urls]
+response = pipe(prompts)
+print(response)
+```
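+#### Multi-turn conversation
+There are two ways to hold a multi-turn conversation with the pipeline. One is to construct messages in the OpenAI format and use the method described above; the other is to use the `pipeline.chat` interface.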
+```python
+from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig, GenerationConfig
+from lmdeploy.vl import load_image
+model = 'OpenGVLab/InternVL2-4B'
+system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。'
+chat_template_config = ChatTemplateConfig('internvl-phi3')
+chat_template_config.meta_instruction = system_prompt
+pipe = pipeline(model, chat_template_config=chat_template_config,
+                backend_config=PytorchEngineConfig(session_len=8192))
+image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg')
+gen_config = GenerationConfig(top_k=40, top_p=0.8, temperature=0.8)
+sess = pipe.chat(('describe this image', image), gen_config=gen_config)
+print(sess.response.text)
+sess = pipe.chat('What is the woman doing?', session=sess, gen_config=gen_config)
+print(sess.response.text)
+```
+#### Service
+For lmdeploy v0.5.0, configure the chat template first by creating the following JSON file, `chat_template.json`:
+```json
+{
+    "model_name":"internvl-phi3",
+    "meta_instruction":"我是书生·万象，英文名是InternVL，是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新，开源开放，共享共创，推动科技进步和产业发展。",
+    "stop_words":["<|end|>"]
+}
+```
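+LMDeploy's `api_server` lets a model be packaged as a service with a single command. The RESTful APIs it provides are compatible with OpenAI's interfaces. Below is an example of starting the service: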
+```shell
+lmdeploy serve api_server OpenGVLab/InternVL2-4B --backend pytorch --chat-template chat_template.json
+```
+The default port of `api_server` is `23333`. After the server starts, you can communicate with it from the terminal through `api_client`:
+```shell
+lmdeploy serve api_client http://0.0.0.0:23333
+```
+You can browse and try out the `api_server` APIs online through the Swagger UI at `http://0.0.0.0:23333`, or read the API specification [here](https://github.com/InternLM/lmdeploy/blob/main/docs/en/serving/restful_api.md).
 ## License