INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
Zero-Shot Performance of LLMs on Intent Detection and Slot Filling
Intent detection, evaluated by accuracy. Each score is averaged over five prompt templates; the AVG column covers only the 16 African languages and excludes English (eng).
Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Llama 3.1 8B | 27.6 | 1.9 | 2.1 | 4.8 | 5.5 | 3.3 | 5.3 | 2.4 | 1.6 | 2.8 | 2.9 | 14.1 | 2.6 | 4.0 | 3.2 | 3.5 | 2.8 | 3.9±2.4 |
Gemma 2 9B | 77.6 | 49.2 | 6.1 | 40.8 | 31.5 | 23.8 | 22.2 | 23.2 | 7.7 | 29.7 | 19.9 | 70.0 | 21.0 | 13.8 | 40.1 | 32.2 | 36.3 | 29.2±8.7 |
Aya-101 13B | 65.3 | 62.9 | 13.4 | 57.8 | 56.9 | 40.4 | 27.8 | 33.9 | 20.8 | 51.2 | 43.9 | 65.9 | 27.2 | 19.7 | 58.1 | 45.9 | 53.2 | 42.4±9.1 |
Gemma 2 27B | 79.5 | 47.2 | 6.3 | 46.5 | 36.9 | 26.7 | 27.5 | 26.1 | 5.8 | 36.7 | 25.6 | 75.5 | 21.2 | 16.4 | 50.2 | 34.8 | 44.3 | 33.0±9.6 |
Llama 3.3 70B | 81.1 | 56.2 | 9.5 | 52.3 | 52.4 | 35.0 | 37.5 | 37.7 | 12.4 | 32.3 | 30.5 | 80.6 | 29.3 | 20.9 | 43.5 | 41.4 | 43.9 | 38.5±9.5 |
Gemini 1.5 Pro | 81.8 | 77.9 | 24.3 | 74.8 | 65.4 | 61.5 | 54.6 | 59.3 | 39.3 | 68.6 | 51.6 | 83.2 | 47.2 | 25.6 | 76.2 | 66.8 | 68.7 | 59.1±9.6 |
GPT-4o (Aug) | 80.9 | 76.0 | 15.1 | 80.7 | 71.8 | 64.7 | 56.4 | 68.2 | 59.3 | 75.5 | 59.7 | 84.5 | 58.6 | 43.7 | 79.6 | 77.0 | 71.2 | 65.1±9.3 |
Gemma 2 9B IT (SFT) | 81.2 | 83.3 | 77.1 | 89.8 | 86.7 | 78.6 | 85.8 | 83.6 | 84.6 | 87.7 | 76.8 | 88.8 | 82.6 | 85.1 | 89.1 | 87.9 | 78.9 | 84.1 |
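
The AVG column can be checked against the per-language scores. The following is a minimal sketch, assuming AVG is the unweighted mean of the 16 African-language columns with `eng` left out; the helper name `african_average` is illustrative and not part of any released code. It uses the Llama 3.1 8B row from the table above and does not attempt to reproduce the ± term, which the page does not define.

```python
# Per-language accuracies copied from the Llama 3.1 8B row of the table above.
row = {
    "eng": 27.6, "amh": 1.9, "ewe": 2.1, "hau": 4.8, "ibo": 5.5,
    "kin": 3.3, "lin": 5.3, "lug": 2.4, "orm": 1.6, "sna": 2.8,
    "sot": 2.9, "swa": 14.1, "twi": 2.6, "wol": 4.0, "xho": 3.2,
    "yor": 3.5, "zul": 2.8,
}


def african_average(scores, exclude=("eng",)):
    """Unweighted mean over every language not listed in `exclude`.

    Assumption: this mirrors how the AVG column is built (English excluded).
    """
    kept = [value for lang, value in scores.items() if lang not in exclude]
    return sum(kept) / len(kept)


avg = african_average(row)
print(f"{avg:.1f}")  # compare with the AVG cell of the same row (3.9)
```

Under this assumption the result rounds to the 3.9 reported in the AVG column, which suggests that English is indeed excluded from the average.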
Slot filling, evaluated by F1-score. Each score is averaged over five prompt templates; the AVG column covers only the 16 African languages and excludes English (eng).
Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Llama 3.1 8B | 25.0 | 3.7 | 5.6 | 11.1 | 12.6 | 8.5 | 9.1 | 10.1 | 2.8 | 9.9 | 11.5 | 17.3 | 11.2 | 9.2 | 2.6 | 11.0 | 9.0 | 9.1±2.2 |
Gemma 2 IT 9B | 34.1 | 4.5 | 0.3 | 7.4 | 10.6 | 5.0 | 6.0 | 5.6 | 0.1 | 7.3 | 10.8 | 21.2 | 2.4 | 2.6 | 2.2 | 5.2 | 8.2 | 6.2±2.9 |
Aya-101 13B | 21.4 | 8.2 | 7.9 | 11.8 | 14.6 | 12.2 | 9.4 | 15.5 | 3.6 | 15.0 | 17.0 | 16.2 | 13.8 | 14.0 | 2.8 | 9.6 | 10.6 | 11.4±2.4 |
Gemma 2 IT 27B | 49.8 | 15.7 | 9.5 | 24.1 | 25.2 | 21.7 | 15.2 | 28.4 | 2.6 | 29.8 | 28.0 | 40.2 | 24.3 | 23.3 | 4.5 | 28.1 | 31.0 | 22.0±5.8 |
Llama 3.3 70B Instruct | 52.6 | 26.3 | 22.0 | 29.5 | 35.0 | 31.4 | 25.0 | 30.4 | 9.3 | 29.5 | 36.4 | 40.7 | 35.6 | 36.4 | 6.9 | 34.2 | 31.9 | 28.8±5.2 |
Gemini 1.5 Pro | 52.8 | 15.2 | 18.7 | 31.9 | 35.8 | 34.4 | 34.9 | 34.4 | 12.2 | 36.8 | 43.0 | 37.5 | 34.5 | 34.2 | 6.9 | 33.2 | 38.6 | 30.1±6.1 |
GPT-4o (Aug) | 55.4 | 22.8 | 19.4 | 37.8 | 38.9 | 36.4 | 33.5 | 35.3 | 13.0 | 40.2 | 40.9 | 46.5 | 40.1 | 37.9 | 10.0 | 42.4 | 37.6 | 33.3±6.0 |
Gemma 2 9B IT (SFT) | 80.6 | 80.7 | 82.0 | 92.2 | 81.3 | 75.5 | 88.5 | 85.8 | 81.1 | 82.5 | 77.2 | 87.7 | 86.3 | 82.9 | 89.6 | 88.4 | 68.8 | 83.1 |
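
The page does not say how the slot-filling F1 is computed. A common convention for slot filling is span-level (entity-level) micro F1 over BIO tags, where a predicted slot counts as correct only if both its label and its boundaries match the gold slot exactly. The sketch below illustrates that convention under this assumption; the function names and the slot labels (`city`, `date`) are hypothetical and are not taken from the dataset.

```python
def extract_spans(tags):
    """Collect (label, start, end) spans from one BIO tag sequence.

    Strict reading: a span must open with B-; a stray I- tag whose label
    does not continue the open span closes that span and is then ignored.
    """
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def span_f1(gold_seqs, pred_seqs):
    """Micro-averaged span-level F1 over a list of tag sequences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
        tp += len(gold_spans & pred_spans)  # exact label and boundary match
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction finds the city slot but misses the date slot.
gold = [["O", "B-city", "I-city", "O", "B-date"]]
pred = [["O", "B-city", "I-city", "O", "O"]]
print(span_f1(gold, pred))  # precision 1.0, recall 0.5
```

Exact-match scoring of this kind penalizes both missed slots and boundary errors, which may partly explain why zero-shot F1 scores sit well below the accuracy scores for the same models.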
Bold values indicate the best performance for each language/metric.
@misc{yu2025injongo,
title={INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages},
author={Hao Yu and Jesujoba O. Alabi and Andiswa Bukula and Jian Yun Zhuang and En-Shiun Annie Lee and Tadesse Kebede Guge and Israel Abebe Azime and Happy Buzaaba and Blessing Kudzaishe Sibanda and Godson K. Kalipe and Jonathan Mukiibi and Salomon Kabongo Kabenamualu and Mmasibidi Setaka and Lolwethu Ndolela and Nkiruka Odu and Rooweither Mabuya and Shamsuddeen Hassan Muhammad and Salomey Osei and Sokhar Samb and Juliet W. Murage and Dietrich Klakow and David Ifeoluwa Adelani},
year={2025},
eprint={2502.09814},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.09814},
}
@misc{adelani2023sib200,
title={SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects},
author={David Ifeoluwa Adelani and Hannah Liu and Xiaoyu Shen and Nikita Vassilyev and Jesujoba O. Alabi and Yanke Mao and Haonan Gao and Annie En-Shiun Lee},
year={2023},
eprint={2309.07445},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2309.07445},
}