INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
Evaluation Comparison
Intent Detection (Accuracy)
In-language training
| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| mT5-Large | 80.5 | 91.5 | 77.3 | 94.6 | 92.9 | 83.7 | 91.3 | 83.3 | 73.3 | 92.6 | 80.2 | 95.8 | 85.3 | 91.6 | 95.8 | 90.9 | 82.4 | 87.7±4.1 |
| AfriTeVa V2 (T5) | 81.6 | 93.2 | 84.4 | **98.9** | 95.7 | 87.8 | 91.6 | 86.8 | 86.6 | 94.6 | 85.7 | 96.8 | 87.1 | 94.0 | 97.3 | 97.0 | 89.2 | 91.7±2.7 |
| NLLB LLM2Vec | **88.4** | 94.2 | 87.8 | 98.3 | **96.8** | 89.2 | **95.2** | **93.2** | 86.2 | **96.1** | **87.3** | 97.4 | 93.5 | 95.6 | **97.5** | 97.3 | 89.1 | 93.4±2.3 |
| XLM-RoBERTa | 83.5 | 92.9 | 77.9 | 96.0 | 88.8 | 69.6 | 90.5 | 78.9 | 75.0 | 83.8 | 76.0 | 96.7 | 79.5 | 90.2 | 89.6 | 92.6 | 74.7 | 84.5±4.9 |
| AfriBERTa V2 | 74.2 | 91.2 | 78.3 | 98.2 | 93.8 | 83.1 | 91.0 | 83.8 | 78.8 | 89.5 | 81.9 | 96.0 | 83.2 | 92.3 | 94.4 | 95.0 | 86.7 | 88.6±3.5 |
| AfroXLMR | 84.1 | 95.3 | 84.6 | 98.3 | 96.0 | 88.2 | 93.3 | 85.2 | **88.3** | 95.3 | 85.5 | 97.8 | 88.8 | 95.8 | 97.3 | 96.1 | 89.0 | 92.2±3.0 |
| AfroXLMR 76L | 84.5 | **95.5** | **90.4** | 98.7 | 96.3 | **89.4** | 94.6 | 91.3 | **88.3** | 95.1 | 86.8 | **98.1** | **93.6** | **96.2** | 96.9 | **97.7** | **89.8** | **93.7±2.1** |
Multi-lingual training
| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| AfroXLMR-large-76L-Injongo-intent | 89.0 | 96.0 | 92.6 | 99.2 | 96.6 | 87.7 | 95.9 | 92.3 | 92.9 | 96.5 | 87.6 | 97.8 | 94.2 | 97.1 | 97.3 | 97.9 | 89.2 | 94.4±2.0 |
Slot Filling (F1)
In-language training
| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| mT5-Large | 73.7 | 80.9 | 71.6 | 89.4 | 80.5 | 74.2 | 82.6 | 78.9 | 72.1 | 81.1 | 74.7 | 88.1 | 79.0 | 76.9 | 88.4 | 78.9 | 68.3 | 79.1±3.7 |
| AfriTeVa V2 (T5) | 73.6 | 80.9 | 74.5 | 93.8 | 79.9 | 76.6 | 87.1 | 85.2 | 79.0 | 82.1 | **77.5** | 88.9 | 84.0 | 79.0 | 90.0 | 87.2 | 71.2 | 82.3±3.3 |
| NLLB LLM2Vec | 74.6 | 82.4 | 80.5 | 93.6 | 78.1 | 70.1 | 84.8 | 86.6 | 80.8 | 81.4 | 74.8 | 85.7 | 85.7 | 78.3 | 88.0 | 85.0 | 78.3 | 82.1±3.1 |
| XLM-RoBERTa | 77.9 | 84.8 | 79.9 | 93.9 | 76.6 | 69.3 | 86.3 | 83.8 | 83.8 | 79.3 | 71.7 | 88.7 | 84.2 | 79.3 | 89.1 | 83.9 | 79.4 | 82.1±3.5 |
| AfriBERTa V2 | 70.7 | 82.2 | 77.9 | 93.7 | 78.3 | 73.8 | 84.4 | 84.1 | 81.0 | 81.8 | 73.5 | 87.6 | 81.9 | 78.3 | 88.5 | 86.2 | 79.6 | 82.1±2.9 |
| AfroXLMR | **79.0** | 86.2 | 81.6 | **95.1** | **82.0** | 76.3 | 87.1 | 88.5 | 84.9 | **84.9** | **77.5** | **90.2** | 85.5 | **81.7** | **91.1** | 87.3 | **82.5** | 85.2±2.7 |
| AfroXLMR 76L | 78.7 | **86.3** | **84.5** | 94.3 | 81.9 | **76.7** | **88.0** | **88.8** | **85.5** | **84.9** | 77.4 | **90.2** | **89.8** | 81.3 | 90.5 | **88.1** | 81.3 | **85.6±2.7** |
Multi-lingual training
| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| AfroXLMR-large-76L-Injongo-slot | 82.4 | 88.2 | 87.0 | 96.3 | 84.0 | 79.3 | 90.3 | 89.2 | 87.2 | 86.1 | 80.4 | 90.5 | 90.3 | 83.3 | 91.8 | 90.2 | 83.3 | 87.3±2.4 |
Language Codes
- eng: English
- amh: Amharic
- ewe: Ewe
- hau: Hausa
- ibo: Igbo
- kin: Kinyarwanda
- lin: Lingala
- lug: Luganda
- orm: Oromo
- sna: Shona
- sot: Sesotho
- swa: Swahili
- twi: Twi
- wol: Wolof
- xho: Xhosa
- yor: Yoruba
- zul: Zulu
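For programmatic use (e.g. when mapping table columns back to language names), the codes listed above can be expressed as a small Python mapping; a minimal sketch (the constant names are illustrative, not part of the dataset's API):

```python
# Three-letter language codes used in the tables above, mapped to names.
INJONGO_LANGUAGES = {
    "eng": "English", "amh": "Amharic", "ewe": "Ewe", "hau": "Hausa",
    "ibo": "Igbo", "kin": "Kinyarwanda", "lin": "Lingala", "lug": "Luganda",
    "orm": "Oromo", "sna": "Shona", "sot": "Sesotho", "swa": "Swahili",
    "twi": "Twi", "wol": "Wolof", "xho": "Xhosa", "yor": "Yoruba",
    "zul": "Zulu",
}

# The dataset covers 16 African languages; English is included in the
# tables only as a reference column.
AFRICAN_LANGUAGES = [code for code in INJONGO_LANGUAGES if code != "eng"]
assert len(AFRICAN_LANGUAGES) == 16
```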
Notes
- Bold values mark the best score in each language column of the in-language tables
- AfroXLMR 76L shows the strongest overall in-language performance on both tasks
- Multi-lingual training generally outperforms in-language training
- Averages are reported with standard deviations (AVG ± std)
- AVG does not include the English (eng) results
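As a sanity check of the AVG convention (English excluded), the per-language mean can be recomputed from any table row. A minimal sketch using the mT5-Large intent-detection row; note the ±4.1 is the deviation reported in the table, not recomputed here:

```python
from statistics import mean

# mT5-Large in-language intent-detection accuracies, in table order.
# English (80.5) is deliberately excluded, per the AVG convention above.
mt5_scores = {
    "amh": 91.5, "ewe": 77.3, "hau": 94.6, "ibo": 92.9, "kin": 83.7,
    "lin": 91.3, "lug": 83.3, "orm": 73.3, "sna": 92.6, "sot": 80.2,
    "swa": 95.8, "twi": 85.3, "wol": 91.6, "xho": 95.8, "yor": 90.9,
    "zul": 82.4,
}

avg = mean(mt5_scores.values())
print(f"{avg:.1f}")  # matches the table's AVG of 87.7
```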
Citation
@misc{yu2025injongo,
  title={INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages},
  author={Hao Yu and Jesujoba O. Alabi and Andiswa Bukula and Jian Yun Zhuang and En-Shiun Annie Lee and Tadesse Kebede Guge and Israel Abebe Azime and Happy Buzaaba and Blessing Kudzaishe Sibanda and Godson K. Kalipe and Jonathan Mukiibi and Salomon Kabongo Kabenamualu and Mmasibidi Setaka and Lolwethu Ndolela and Nkiruka Odu and Rooweither Mabuya and Shamsuddeen Hassan Muhammad and Salomey Osei and Sokhar Samb and Juliet W. Murage and Dietrich Klakow and David Ifeoluwa Adelani},
  year={2025},
  eprint={2502.09814},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.09814},
}
@misc{adelani2023sib200,
  title={SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects},
  author={David Ifeoluwa Adelani and Hannah Liu and Xiaoyu Shen and Nikita Vassilyev and Jesujoba O. Alabi and Yanke Mao and Haonan Gao and Annie En-Shiun Lee},
  year={2023},
  eprint={2309.07445},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2309.07445},
}