Visual Question Answering

This model is a work in progress, and the current release does not represent its final quality.

Alignment for multilingual VQA tasks is being trained between blip2-flan-t5-xxl and Guanaco, using only linear layers.
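As a rough illustration of what "using only linear layers" can mean in a MiniGPT-4-style setup, the sketch below projects visual tokens into the LLM's embedding space with a single linear map and prepends them to the embedded prompt. It is a minimal NumPy sketch, not this model's actual code: the class `LinearProjection`, the function `build_llm_inputs`, and all sizes (32 visual tokens of width 768, LLM embedding width 4096) are assumptions chosen for illustration, and random arrays stand in for the frozen encoder and the LLM embeddings.

```python
import numpy as np

# Assumed, illustrative sizes (not taken from this model card).
NUM_VISUAL_TOKENS = 32  # visual tokens produced by the frozen vision side
VISION_DIM = 768        # width of each visual token
LLM_DIM = 4096          # width of the LLM's token embeddings


class LinearProjection:
    """Hypothetical trainable linear layer: y = x @ W + b."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.normal(0.0, 0.02, size=(in_dim, out_dim))
        self.bias = np.zeros(out_dim)

    def __call__(self, x):
        return x @ self.weight + self.bias


def build_llm_inputs(visual_tokens, text_embeddings, projection):
    """Project visual tokens into the LLM space and prepend them to the text."""
    projected = projection(visual_tokens)
    return np.concatenate([projected, text_embeddings], axis=0)


rng = np.random.default_rng(1)
# Random stand-ins for the frozen encoder output and 10 embedded prompt tokens.
visual_tokens = rng.normal(size=(NUM_VISUAL_TOKENS, VISION_DIM))
text_embeddings = rng.normal(size=(10, LLM_DIM))

projection = LinearProjection(VISION_DIM, LLM_DIM)
llm_inputs = build_llm_inputs(visual_tokens, text_embeddings, projection)
print(llm_inputs.shape)  # (42, 4096)
```

In such a setup only the projection's weight and bias would receive gradient updates, which keeps the trained weight file small compared with the vision model and the LLM.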

The latest weight file, based on the MiniGPT-4 implementation, is provided here.

This model supports English, Chinese, Japanese, and German, and must be used together with the Guanaco 7B LLM.

A portion of the dataset has already been released.
