---
language:
- en
pipeline_tag: text-generation
---
# Bespoke-MiniChart-7B
[Open in Colab](https://colab.research.google.com/drive/1FEmlwGgn9209iQO-rs2-9UHPLoytwZMH?usp=sharing)
This is an open-source chart-understanding Vision-Language Model (VLM) developed at [Bespoke Labs](https://www.bespokelabs.ai/) and maintained by [Liyan Tang](https://www.tangliyan.com/) and Bespoke Labs. It sets a new state of the art in chart question answering (Chart-QA) among 7-billion-parameter models and, averaged over seven public benchmarks, outperforms much larger closed models such as Gemini-1.5-Pro and Claude-3.5.
See our blog for more information on how the model was trained.
# Model Performance
Our model achieves state-of-the-art performance on chart understanding among models of similar size. Averaged over the seven benchmarks below, it also surpasses closed models such as Gemini-1.5-Pro and Claude-3.5 (72.5 vs. 72.3 and 72.2), although those models still lead on some individual benchmarks.
| Model / Category | ChartQAPro (1637) | ChartQA (2500) | EvoChart (1250) | CharXiv (4000) | ChartX (1152) | ChartBench (2100) | MMC (808) | Average |
|----------------------------------------|------------------:|---------------:|----------------:|---------------:|--------------:|------------------:|----------:|--------:|
| **Open-Models (11 B and less)** | | | | | | | | |
| InternVL-2.5-8B | –| **78.2** | 53.0 | 55.7 | 49.5 | 44.7 | **85.5** | – |
| Qwen2-VL-7B | –| **82.1** | 54.5 | 53.5 | 50.8 | 50.8 | **83.9** | – |
| Qwen2.5-VL-7B | **53.5** | **86.0** | 67.9 | 60.9 | 67.0 | 61.4 | **86.0** | **69.0**|
| **Ours** | | | | | | | | |
| Bespoke-MiniChart-7B | **56.7** | **89.5** | **71.8** | **66.4** | **68.9** | **66.1** | **88.4** | **72.5**|
| **Open-Models (32 B and more)** | | | | | | | | |
| QVQ-72B-Preview | –| **84.2** | 65.0 | 59.0 | 60.9 | 53.8 | **83.4** | – |
| Qwen2.5-VL-32B | **58.4** | **89.5** | 74.3 | 66.9 | 64.5 | 59.8 | **89.6** | **71.9**|
| Qwen2.5-VL-72B | **59.0** | **90.0** | **76.8** | 67.1 | **67.2** | 61.5 | **91.2** | **73.3**|
| **Closed-Models** | | | | | | | | |
| GPT-4o | **53.6** | 85.7 | 71.7 | 67.8 | 54.3 | 46.1 | **89.1** | 66.9 |
| Gemini-1.5-flash | **53.8** | 85.6 | 67.5 | 67.7 | 63.5 | 58.1 | 82.1 | 68.3 |
| Gemini-1.5-pro | **59.2** | **89.0** | 72.0 | **69.9** | 65.4 | 62.4 | 87.9 | **72.3**|
| Claude-3.5 | **56.6** | 85.7 | **78.1** | **69.7** | 64.7 | 60.9 | **89.9** | **72.2**|
| Claude-3.7 | **63.0** | 86.1 | **80.1** | **69.7** | **69.2** | **65.0** | 88.4 | **74.5**|
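The Average column appears to be the unweighted mean of the seven benchmark scores, rounded to one decimal place. The sketch below checks that reading for a few rows copied from the table above; the names `SCORES`, `AVERAGES`, and `unweighted_average` are illustrative and not part of any released code.

```python
# Scores copied from the table above, in column order:
# ChartQAPro, ChartQA, EvoChart, CharXiv, ChartX, ChartBench, MMC.
SCORES = {
    "Bespoke-MiniChart-7B": [56.7, 89.5, 71.8, 66.4, 68.9, 66.1, 88.4],
    "Qwen2.5-VL-7B": [53.5, 86.0, 67.9, 60.9, 67.0, 61.4, 86.0],
    "Gemini-1.5-pro": [59.2, 89.0, 72.0, 69.9, 65.4, 62.4, 87.9],
    "Claude-3.5": [56.6, 85.7, 78.1, 69.7, 64.7, 60.9, 89.9],
}


def unweighted_average(scores):
    """Mean of the per-benchmark scores, rounded to one decimal place."""
    return round(sum(scores) / len(scores), 1)


# Assumption: every benchmark counts equally, regardless of its test-set size.
AVERAGES = {model: unweighted_average(s) for model, s in SCORES.items()}

for model, average in AVERAGES.items():
    print(f"{model}: {average}")
```

Under this assumption the computed values match the Average column for the rows shown, so the column does not seem to weight benchmarks by their size.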
# Model Use
```python
import base64
from io import BytesIO

import matplotlib.pyplot as plt
import requests
from PIL import Image
from vllm import LLM, SamplingParams

QA_PROMPT = """Please answer the question using the chart image.
Question: [QUESTION]
Please first generate your reasoning process and then provide the user with the answer. Use the following format:
... your thinking process here ...
... your final answer (entity(s) or number) ...
"""


def get_image_from_url(image_url):
    """Download an image and return it as a PIL Image, or None on failure."""
    try:
        response = requests.get(image_url, timeout=30)
        response.raise_for_status()
        return Image.open(BytesIO(response.content))
    except Exception as e:
        print(f"Error with image: {e}")
        return None


def get_answer(image_url, question, display=True):
    """Ask the model a question about the chart at image_url."""
    image = get_image_from_url(image_url)
    # Check for a failed download before trying to display or encode the image.
    if image is None:
        return "Error downloading image"

    if display:
        plt.figure(figsize=(10, 8))
        plt.imshow(image)
        plt.axis('off')
        plt.show()

    # Re-encode the image and make the data URL's MIME type match its format.
    image_format = image.format or 'JPEG'
    buffered = BytesIO()
    image.save(buffered, format=image_format)
    encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')
    mime_type = Image.MIME.get(image_format, 'image/jpeg')

    messages = [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded_image}"}},
            {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)}
        ]
    }]

    # Greedy decoding (temperature=0) keeps the output deterministic.
    response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500))
    return response[0].outputs[0].text


# Initialize the LLM
llm = LLM(
    model="bespokelabs/Bespoke-MiniChart-7B",
    tokenizer_mode="auto",
    max_model_len=15000,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.9,
    mm_processor_kwargs={"max_pixels": 1600*28*28},
    seed=2025,
    trust_remote_code=True,
)

# Running inference
image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true"
question = "How many global regions maintained their startup funding losses below 30% in 2022?"
print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question))
```
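The example above downloads the chart from a URL. For a chart stored on disk, one possible approach is to encode the file directly as a base64 data URL and pass it in the same message structure. The following is a minimal sketch using only the standard library; `image_file_to_data_url` and `build_messages` are hypothetical helper names, and the MIME type is guessed from the file extension rather than from the file contents.

```python
import base64
import mimetypes
from pathlib import Path


def image_file_to_data_url(path):
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    path = Path(path)
    # Assumption: the file extension correctly reflects the image format.
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type for {path.name!r}")
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def build_messages(data_url, prompt_text):
    """Build a single-turn chat message with one image and one text part."""
    return [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": data_url}},
            {"type": "text", "text": prompt_text},
        ],
    }]
```

The resulting list can be passed to `llm.chat` in place of `messages` in `get_answer`, with `prompt_text` set to `QA_PROMPT.replace("[QUESTION]", question)`.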
# Licence
This work is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
For commercial licensing, please contact company@bespokelabs.ai.
# Citation
```
@misc{bespoke_minichart_7b,
title = {Bespoke-MiniChart-7B: pushing the frontiers of open VLMs for chart understanding},
author = {Liyan Tang and Shreyas Pimpalgaonkar and Kartik Sharma and Alexandros G. Dimakis and Mahesh Sathiamoorthy and Greg Durrett},
howpublished = {blog post},
year = {2025},
url={https://huggingface.co/bespokelabs/Bespoke-MiniChart-7B},
}
```
# Acknowledgements
**Bespoke Labs** team:
- Liyan Tang
- Shreyas Pimpalgaonkar
- Kartik Sharma
- Alex Dimakis
- Mahesh Sathiamoorthy
- Greg Durrett
*Model perfected at Bespoke Labs — where careful curation meets cutting‑edge modeling.*