---
language:
  - en
pipeline_tag: text-generation
---

# Bespoke-MiniChart-7B

This is an open-source chart-understanding Vision-Language Model (VLM) developed at Bespoke Labs and maintained by Liyan Tang and Bespoke Labs. It sets a new state of the art in chart question answering (Chart-QA) among 7-billion-parameter models, outperforming much larger closed models such as Gemini-1.5-Pro and Claude-3.5 across seven public benchmarks.

Please see our blog post for details on how we trained the model.

## Model Performance

Our model achieves state-of-the-art chart-understanding performance among models of similar size, and even surpasses closed models such as Gemini-1.5-Pro and Claude-3.5.

| Model / Category | ChartQAPro (1637) | ChartQA (2500) | EvoChart (1250) | CharXiv (4000) | ChartX (1152) | ChartBench (2100) | MMC (808) | Average |
|---|---|---|---|---|---|---|---|---|
| **Open models (11B and less)** | | | | | | | | |
| InternVL-2.5-8B | – | 78.2 | 53.0 | 55.7 | 49.5 | 44.7 | 85.5 | – |
| Qwen2-VL-7B | – | 82.1 | 54.5 | 53.5 | 50.8 | 50.8 | 83.9 | – |
| Qwen2.5-VL-7B | 53.5 | 86.0 | 67.9 | 60.9 | 67.0 | 61.4 | 86.0 | 69.0 |
| **Ours** | | | | | | | | |
| **Bespoke-MiniChart-7B** | **56.7** | **89.5** | **71.8** | **66.4** | **68.9** | **66.1** | **88.4** | **72.5** |
| **Open models (32B and more)** | | | | | | | | |
| QVQ-72B-Preview | – | 84.2 | 65.0 | 59.0 | 60.9 | 53.8 | 83.4 | – |
| Qwen2.5-VL-32B | 58.4 | 89.5 | 74.3 | 66.9 | 64.5 | 59.8 | 89.6 | 71.9 |
| Qwen2.5-VL-72B | 59.0 | 90.0 | 76.8 | 67.1 | 67.2 | 61.5 | 91.2 | 73.3 |
| **Closed models** | | | | | | | | |
| GPT-4o | 53.6 | 85.7 | 71.7 | 67.8 | 54.3 | 46.1 | 89.1 | 66.9 |
| Gemini-1.5-flash | 53.8 | 85.6 | 67.5 | 67.7 | 63.5 | 58.1 | 82.1 | 68.3 |
| Gemini-1.5-pro | 59.2 | 89.0 | 72.0 | 69.9 | 65.4 | 62.4 | 87.9 | 72.3 |
| Claude-3.5 | 56.6 | 85.7 | 78.1 | 69.7 | 64.7 | 60.9 | 89.9 | 72.2 |
| Claude-3.7 | 63.0 | 86.1 | 80.1 | 69.7 | 69.2 | 65.0 | 88.4 | 74.5 |
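The reported average is the unweighted mean of the seven per-benchmark scores. A quick sanity check for the Bespoke-MiniChart-7B row:

```python
# Per-benchmark scores for Bespoke-MiniChart-7B, as reported above.
scores = {
    "ChartQAPro": 56.7, "ChartQA": 89.5, "EvoChart": 71.8,
    "CharXiv": 66.4, "ChartX": 68.9, "ChartBench": 66.1, "MMC": 88.4,
}

# Unweighted mean across the seven benchmarks, rounded to one decimal.
average = round(sum(scores.values()) / len(scores), 1)
print(average)  # 72.5
```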

## Model Use

```python
import base64
from io import BytesIO

import matplotlib.pyplot as plt
import requests
from PIL import Image
from vllm import LLM, SamplingParams

QA_PROMPT = """Please answer the question using the chart image.

Question: [QUESTION]

Please first generate your reasoning process and then provide the user with the answer. Use the following format:

<think>
... your thinking process here ...
</think>
<answer>
... your final answer (entity(s) or number) ...
</answer>"""

def get_image_from_url(image_url):
    """Download an image and return it as a PIL Image, or None on failure."""
    try:
        response = requests.get(image_url, stream=True)
        response.raise_for_status()
        return Image.open(BytesIO(response.content))
    except Exception as e:
        print(f"Error with image: {e}")
        return None

def get_answer(image_url, question, display=True):
    image = get_image_from_url(image_url)
    if image is None:
        return "Error downloading image"

    if display:
        plt.figure(figsize=(10, 8))
        plt.imshow(image)
        plt.axis('off')
        plt.show()

    # Encode the image as a base64 data URL, preserving its original format.
    image_format = (image.format or "JPEG").lower()
    buffered = BytesIO()
    image.save(buffered, format=image.format or "JPEG")
    encoded_image = base64.b64encode(buffered.getvalue()).decode("utf-8")

    messages = [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/{image_format};base64,{encoded_image}"}},
            {"type": "text", "text": QA_PROMPT.replace("[QUESTION]", question)},
        ],
    }]

    response = llm.chat([messages], sampling_params=SamplingParams(temperature=0, max_tokens=500))
    return response[0].outputs[0].text

# Initialize the model with vLLM.
llm = LLM(
    model="bespokelabs/Bespoke-MiniChart-7B",
    tokenizer_mode="auto",
    max_model_len=15000,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.9,
    mm_processor_kwargs={"max_pixels": 1600 * 28 * 28},
    seed=2025,
    trust_remote_code=True,
)

# Run inference on an example chart.
image_url = "https://github.com/bespokelabsai/minichart-playground-examples/blob/main/images/ilyc9wk4jf8b1.png?raw=true"
question = "How many global regions maintained their startup funding losses below 30% in 2022?"

print("\n\n=================Model Output:===============\n\n", get_answer(image_url, question))
```
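The prompt asks the model to wrap its reasoning in `<think>` tags and its final answer in `<answer>` tags, so the raw output contains both. If you only want the final answer, a small post-processing helper (a sketch; `extract_answer` is an illustration, not part of the model's API) can strip the reasoning:

```python
import re

def extract_answer(model_output: str) -> str:
    """Return the text inside <answer>...</answer>, or the raw output if the tags are missing."""
    match = re.search(r"<answer>\s*(.*?)\s*</answer>", model_output, re.DOTALL)
    return match.group(1) if match else model_output.strip()

sample = "<think>\nThe chart shows three regions below 30%.\n</think>\n<answer>\n3\n</answer>"
print(extract_answer(sample))  # -> 3
```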

## License

This work is licensed under CC BY-NC 4.0. For commercial licensing, please contact [email protected].

## Citation

```bibtex
@misc{bespoke_minichart_7b,
  title        = {Bespoke-MiniChart-7B: pushing the frontiers of open VLMs for chart understanding},
  author       = {Liyan Tang and Shreyas Pimpalgaonkar and Kartik Sharma and Alexandros G. Dimakis and Mahesh Sathiamoorthy and Greg Durrett},
  howpublished = {blog post},
  year         = {2025},
  url          = {https://huggingface.co/bespokelabs/Bespoke-MiniChart-7B},
}
```

## Acknowledgements

Bespoke Labs team:

- Liyan Tang
- Shreyas Pimpalgaonkar
- Kartik Sharma
- Alex Dimakis
- Mahesh Sathiamoorthy
- Greg Durrett

Model perfected at Bespoke Labs — where careful curation meets cutting‑edge modeling.