---
base_model:
- OpenGVLab/InternVL2-2B
---

This is the [OpenGVLab/InternVL2-2B](https://huggingface.co/OpenGVLab/InternVL2-2B) model converted to OpenVINO,
with INT4-compressed weights for the language model and INT8 weights for the other models.

Use OpenVINO GenAI 2025.1 or later and Pillow to run inference on this model:

- `pip install --upgrade openvino-genai pillow`
- Download a test image: `curl -O "https://storage.openvinotoolkit.org/test_data/images/dog.jpg"`
- Run inference:

```python
import numpy as np
import openvino as ov
import openvino_genai
from PIL import Image

# Pass "GPU" instead of "CPU" in the line below to run the model on an Intel integrated or discrete GPU
pipe = openvino_genai.VLMPipeline("./InternVL2-2B-ov", "CPU")

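# Convert to RGB so grayscale, palette, or RGBA files also yield three channels
image = Image.open("dog.jpg").convert("RGB")
# Shape (1, height, width, 3): a batch axis in front of the HWC pixel data
image_data = np.array(image, dtype=np.uint8)[None]
image_data = ov.Tensor(image_data)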

prompt = "Can you describe the image?"
result = pipe.generate(prompt, image=image_data, max_new_tokens=100)
print(result.texts[0])
```
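The example above passes the image as a `uint8` array shaped `(1, height, width, 3)`. As a minimal sketch of that conversion step on its own, the helper below (the name `image_to_nhwc` is hypothetical, not part of OpenVINO GenAI) forces three channels and adds the batch axis, so it can be checked without loading the model:

```python
import numpy as np
from PIL import Image


def image_to_nhwc(image: Image.Image) -> np.ndarray:
    """Return the image as a uint8 array shaped (1, height, width, 3)."""
    # Force three channels so grayscale, palette, and RGBA files all work.
    rgb = image.convert("RGB")
    # np.asarray yields (height, width, 3); add the batch axis in front.
    return np.asarray(rgb, dtype=np.uint8)[np.newaxis, ...]


# Demo on an in-memory RGBA image, 4 pixels wide and 2 pixels high.
demo = image_to_nhwc(Image.new("RGBA", (4, 2), (255, 0, 0, 128)))
print(demo.shape, demo.dtype)
```

The returned array can be wrapped in `ov.Tensor(...)` and passed as the `image` argument, as in the example above.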

See the [OpenVINO GenAI repository](https://github.com/openvinotoolkit/openvino.genai?tab=readme-ov-file#performing-visual-language-text-generation) for more details on visual language text generation.