---
language:
- en
pipeline_tag: text-generation
---

# Bespoke-MiniChart-7B

[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1FEmlwGgn9209iQO-rs2-9UHPLoytwZMH?usp=sharing)

This is an open-source chart-understanding Vision-Language Model (VLM) developed at [Bespoke Labs](https://www.bespokelabs.ai/) and maintained by [Liyan Tang](https://www.tangliyan.com/) and Bespoke Labs. It sets a new state of the art in chart question answering (ChartQA) among 7-billion-parameter models, outperforming much larger closed models such as Gemini-1.5-Pro and Claude-3.5 across seven public benchmarks. Please check our blog post for more details on how we trained the model.

# Model Performance

Our model achieves state-of-the-art performance on chart understanding among models of similar size, and even surpasses closed models such as Gemini-1.5-Pro and Claude-3.5. The number in parentheses after each benchmark is the size of its evaluation set.

| Model / Category | ChartQAPro (1637) | ChartQA (2500) | EvoChart (1250) | CharXiv (4000) | ChartX (1152) | ChartBench (2100) | MMC (808) | Average |
|----------------------------------|------------------:|---------------:|----------------:|---------------:|--------------:|------------------:|----------:|--------:|
| **Open-Models (11B and less)**   |                   |                |                 |                |               |                   |           |         |
| InternVL-2.5-8B                  | –                 | **78.2**       | 53.0            | 55.7           | 49.5          | 44.7              | **85.5**  | –       |
| Qwen2-VL-7B                      | –                 | **82.1**       | 54.5            | 53.5           | 50.8          | 50.8              | **83.9**  | –       |
| Qwen2.5-VL-7B                    | **53.5**          | **86.0**       | 67.9            | 60.9           | 67.0          | 61.4              | **86.0**  | **69.0**|
| **Ours**                         |                   |                |                 |                |               |                   |           |         |
| Bespoke-MiniChart-7B             | **56.7**          | **89.5**       | **71.8**        | **66.4**       | **68.9**      | **66.1**          | **88.4**  | **72.5**|
| **Open-Models (32B and more)**   |                   |                |                 |                |               |                   |           |         |
| QVQ-72B-Preview                  | –                 | **84.2**       | 65.0            | 59.0           | 60.9          | 53.8              | **83.4**  | –       |
| Qwen2.5-VL-32B                   | **58.4**          | **89.5**       | 74.3            | 66.9           | 64.5          | 59.8              | **89.6**  | **71.9**|
| Qwen2.5-VL-72B                   | **59.0**          | **90.0**       | **76.8**        | 67.1           | **67.2**      | 61.5              | **91.2**  | **73.3**|
| **Closed-Models**                |                   |                |                 |                |               |                   |           |         |
| GPT-4o                           | **53.6**          | 85.7           | 71.7            | 67.8           | 54.3          | 46.1              | **89.1**  | 66.9    |
| Gemini-1.5-flash                 | **53.8**          | 85.6           | 67.5            | 67.7           | 63.5          | 58.1              | 82.1      | 68.3    |
| Gemini-1.5-pro                   | **59.2**          | **89.0**       | 72.0            | **69.9**       | 65.4          | 62.4              | 87.9      | **72.3**|
| Claude-3.5                       | **56.6**          | 85.7           | **78.1**        | **69.7**       | 64.7          | 60.9              | **89.9**  | **72.2**|
| Claude-3.7                       | **63.0**          | 86.1           | **80.1**        | **69.7**       | **69.2**      | **65.0**          | 88.4      | **74.5**|

# Model Use

The example below runs the model with [vLLM](https://github.com/vllm-project/vllm):

```python
import base64
from io import BytesIO

import matplotlib.pyplot as plt
import requests
from PIL import Image
from vllm import LLM, SamplingParams

QA_PROMPT = """Please answer the question using the chart image.

Question: [QUESTION]

Please first generate your reasoning process and then provide the user with the answer. Use the following format:

<think>
... your thinking process here ...
</think>
<answer>
... your final answer (entity(s) or number) ...
</answer>
"""


def get_image_from_url(image_url):
    """Download an image and return it as a PIL Image, or None on failure."""
    try:
        response = requests.get(image_url, stream=True)
        response.raise_for_status()
        return Image.open(BytesIO(response.content))
    except Exception as e:
        print(f"Error with image: {e}")
        return None


def get_answer(image_url, question, display=True):
    image = get_image_from_url(image_url)
    if not image:
        return "Error downloading image"

    if display:
        plt.figure(figsize=(10, 8))
        plt.imshow(image)
        plt.axis('off')
        plt.show()

    # Encode the image as a base64 data URL for vLLM's multimodal chat API.
    buffered = BytesIO()
    img_format = image.format or 'JPEG'
    image.save(buffered, format=img_format)
    encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')

    messages = [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/{img_format.lower()};base64,{encoded_image}"}},
            {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)},
        ],
    }]

    # `llm` is initialized at module level below.
    response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500))
    return response[0].outputs[0].text


# Initialize the LLM
llm = LLM(
    model="bespokelabs/Bespoke-MiniChart-7B",
    tokenizer_mode="auto",
    max_model_len=15000,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.9,
    mm_processor_kwargs={"max_pixels": 1600 * 28 * 28},
    seed=2025,
    trust_remote_code=True,
)

# Running inference
image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true"
question = "How many global regions maintained their startup funding losses below 30% in 2022?"
print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question))
```
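Because the prompt instructs the model to wrap its reasoning in `<think>` tags and its final answer in `<answer>` tags, you will usually want to strip the reasoning before using the answer downstream. Below is a minimal sketch of such a post-processing step; the `extract_answer` helper is illustrative and not part of the model or the vLLM API:

```python
import re


def extract_answer(model_output: str) -> str:
    """Return the text inside the last <answer>...</answer> block.

    Illustrative helper (not part of the model API); falls back to the
    raw output if no tags are found.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", model_output, flags=re.DOTALL)
    return matches[-1].strip() if matches else model_output.strip()


# Example usage, reusing `get_answer` from the snippet above:
raw_output = get_answer(image_url, question, display=False)
print("Final answer:", extract_answer(raw_output))
```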
""" def get_image_from_url(image_url): try: response = requests.get(image_url, stream=True) response.raise_for_status() return Image.open(BytesIO(response.content)) except Exception as e: print(f"Error with image: {e}") return None def get_answer(image_url, question, display=True): image = get_image_from_url(image_url) if display: plt.figure(figsize=(10, 8)) plt.imshow(image) plt.axis('off') plt.show() if not image: return "Error downloading image" buffered = BytesIO() image.save(buffered, format=image.format or 'JPEG') encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8') messages = [{ "role": "user", "content": [ {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"}}, {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)} ] }] response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500)) return response[0].outputs[0].text # Initialize the LLM llm = LLM( model="bespokelabs/Bespoke-MiniChart-7B", tokenizer_mode="auto", max_model_len=15000, tensor_parallel_size=1, gpu_memory_utilization=0.9, mm_processor_kwargs={"max_pixels": 1600*28*28}, seed=2025, trust_remote_code=True, ) # Running inference image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true" question = "How many global regions maintained their startup funding losses below 30% in 2022?" print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question)) ``` # Licence This work is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/). For commercial licensing, please contact company@bespokelabs.ai. # Citation ``` @misc{bespoke_minichart_7b, title = {Bespoke-MiniChart-7B: pushing the frontiers of open VLMs for chart understanding}, author = {Liyan Tang and Shreyas Pimpalgaonkar and Kartik Sharma and Alexandros G. Dimakis and Mahesh Sathiamoorthy and Greg Durrett}, howpublished = {blog post}, year = {2025}, url={https://huggingface.co/bespokelabs/Bespoke-MiniChart-7B}, } ``` # Acknowledgements **Bespoke Labs** team: - Liyan Tang - Shreyas Pimpalgaonkar - Kartik Sharma - Alex Dimakis - Mahesh Sathiamoorthy - Greg Durrett *Model perfected at Bespoke Labs — where careful curation meets cutting‑edge modeling.*