---
language:
- en
pipeline_tag: text-generation
---

<p align="center">
    <img src="./Bespoke-Labs-Logo.png" width="550">
</p>

# Bespoke-MiniChart-7B

<a href="https://playground.bespokelabs.ai/minichart">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/g-QaXrmPLYk5m3Hq5vFtr.png" width="200px" />
</a>

This is an open‑source chart‑understanding Vision‑Language Model (VLM) developed at [Bespoke Labs](https://www.bespokelabs.ai/) and maintained by [Liyan Tang](https://www.tangliyan.com/) and Bespoke Labs. It sets a new state‑of‑the‑art in chart question‑answering (Chart‑QA) for 7 billion‑parameter models, outperforming much larger closed models such as Gemini‑1.5‑Pro and Claude‑3.5 on seven public benchmarks.

1. **Blog Post**: https://www.bespokelabs.ai/blog/bespoke-minichart-7b
2. **Playground**: https://playground.bespokelabs.ai/minichart

---

# Example Outputs

The examples below showcase how Bespoke-MiniChart-7B can perform both visual perception and textual reasoning.


<p align="left">
    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/E5WGhi_fVNzCsrKeNeIs3.png" width="700">
</p>

<p align="left">
    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/bYKXRm3sfOdX3zd_5qUpK.png" width="700">
</p>


# Model Performance

Bespoke-MiniChart-7B achieves state-of-the-art chart-understanding performance among models of similar size, and even surpasses closed models such as Gemini-1.5-Pro and Claude-3.5.

<p align="left">
    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/5pejAyzPG_tRBU6FwH7PA.png" width="700">
</p>

We also compare the model finetuned with SFT+DPO against SFT-only variants.

In the table below, M1 and M2 are models finetuned with 270K and 1M SFT examples respectively, and Bespoke-MiniChart-7B is the model finetuned with SFT+DPO.

<p align="left">
    <img src="https://cdn-uploads.huggingface.co/production/uploads/6444e4417a7b94ddc2d14e1d/WRsPs437niUrXmYtkRajG.png" width="700">
</p>


# Model Use

[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1FEmlwGgn9209iQO-rs2-9UHPLoytwZMH?usp=sharing)

The model is available on the playground here: https://playground.bespokelabs.ai/minichart

You can also run the model with the following snippet: 

```python
import requests
from PIL import Image
from io import BytesIO
import base64
import matplotlib.pyplot as plt
from vllm import LLM, SamplingParams

QA_PROMPT = """Please answer the question using the chart image.

Question: [QUESTION]

Please first generate your reasoning process and then provide the user with the answer. Use the following format:

<think> 
... your thinking process here ... 
</think> 
<answer> 
... your final answer (entity(s) or number) ...
</answer>"""

def get_image_from_url(image_url):
    try:
        response = requests.get(image_url, stream=True)
        response.raise_for_status()
        return Image.open(BytesIO(response.content))
    except Exception as e:
        print(f"Error with image: {e}")
        return None

def get_answer(image_url, question, display=True):
    image = get_image_from_url(image_url)
    if image is None:
        return "Error downloading image"

    if display:
        plt.figure(figsize=(10, 8))
        plt.imshow(image)
        plt.axis('off')
        plt.show()

    # Encode the image as base64 for the chat message
    buffered = BytesIO()
    image.save(buffered, format=image.format or 'JPEG')
    encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')
    
    messages = [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"}},
            {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)}
        ]
    }]

    response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500))
    return response[0].outputs[0].text
    
# Initialize the LLM
llm = LLM(
    model="bespokelabs/Bespoke-MiniChart-7B",
    tokenizer_mode="auto",
    max_model_len=15000,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.9,
    mm_processor_kwargs={"max_pixels": 1600*28*28},
    seed=2025,
    trust_remote_code=True,
)

# Running inference
image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true"
question = "How many global regions maintained their startup funding losses below 30% in 2022?"

print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question))
```
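Because the model replies in the `<think>/<answer>` format shown in `QA_PROMPT`, you will usually want to strip the reasoning trace and keep only the final answer. Below is a minimal sketch of such a post-processing step; `extract_answer` is a hypothetical helper, not part of the released code:

```python
import re

def extract_answer(output: str) -> str:
    """Return the text inside <answer>...</answer>, or the raw output if the tag is missing."""
    match = re.search(r"<answer>\s*(.*?)\s*</answer>", output, flags=re.DOTALL)
    return match.group(1) if match else output.strip()

sample = "<think>\nThree regions stayed below 30%.\n</think>\n<answer>\n3\n</answer>"
print(extract_answer(sample))  # -> 3
```

Falling back to the raw output keeps the helper robust when generation is cut off by `max_tokens` before the closing `</answer>` tag is emitted.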

---
# License

This work is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
For commercial licensing, please contact [email protected].

# Citation

```
@misc{bespoke_minichart_7b,
  title  = {Bespoke-MiniChart-7B: pushing the frontiers of open VLMs for chart understanding},
  author = {Liyan Tang and Shreyas Pimpalgaonkar and Kartik Sharma and Alexandros G. Dimakis and Mahesh Sathiamoorthy and Greg Durrett},
  howpublished = {blog post},
  year   = {2025},
  url={https://huggingface.co/bespokelabs/Bespoke-MiniChart-7B},
}
```

# Acknowledgements

**Bespoke Labs** team:

- Liyan Tang
- Shreyas Pimpalgaonkar
- Kartik Sharma
- Alex Dimakis
- Mahesh Sathiamoorthy
- Greg Durrett


*Model perfected at Bespoke Labs — where careful curation meets cutting‑edge modeling.*