π Iβve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries!
π Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! πβ¨
The Pangea family includes three major components: π₯ Pangea-7B: A state-of-the-art multilingual multimodal LLM capable of 39 languages! Not only does it excel in multilingual scenarios, but it also matches or surpasses English-centric models like Llama 3.2, Molmo, and LlavaOneVision in English performance.
π PangeaIns: A 6M multilingual multimodal instruction tuning dataset across 39 languages. ποΈ With 40% English instructions and 60% multilingual instructions, it spans various domains, including 1M culturally-relevant images sourced from LAION-Multi. π¨
π PangeaBench: A comprehensive evaluation benchmark featuring 14 datasets in 47 languages. Evaluation can be tricky, so we carefully curated existing benchmarks and introduced two new datasets: xChatBench (human-annotated wild queries with fine-grained evaluation criteria) and xMMMU (a meticulously machine-translated version of MMMU).