INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages

Evaluation Comparison

Zero-Shot Performance of LLMs on Intent Detection and Slot Filling

Intent Detection

Metric: accuracy (%). Each score is averaged over five prompt templates; the AVG column averages over the African languages only, excluding English.

| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama 3.1 8B | 27.6 | 1.9 | 2.1 | 4.8 | 5.5 | 3.3 | 5.3 | 2.4 | 1.6 | 2.8 | 2.9 | 14.1 | 2.6 | 4.0 | 3.2 | 3.5 | 2.8 | 3.9±2.4 |
| Gemma 2 9B | 77.6 | 49.2 | 6.1 | 40.8 | 31.5 | 23.8 | 22.2 | 23.2 | 7.7 | 29.7 | 19.9 | 70.0 | 21.0 | 13.8 | 40.1 | 32.2 | 36.3 | 29.2±8.7 |
| Aya-101 13B | 65.3 | 62.9 | 13.4 | 57.8 | 56.9 | 40.4 | 27.8 | 33.9 | 20.8 | 51.2 | 43.9 | 65.9 | 27.2 | 19.7 | 58.1 | 45.9 | 53.2 | 42.4±9.1 |
| Gemma 2 27B | 79.5 | 47.2 | 6.3 | 46.5 | 36.9 | 26.7 | 27.5 | 26.1 | 5.8 | 36.7 | 25.6 | 75.5 | 21.2 | 16.4 | 50.2 | 34.8 | 44.3 | 33.0±9.6 |
| Llama 3.3 70B | 81.1 | 56.2 | 9.5 | 52.3 | 52.4 | 35.0 | 37.5 | 37.7 | 12.4 | 32.3 | 30.5 | 80.6 | 29.3 | 20.9 | 43.5 | 41.4 | 43.9 | 38.5±9.5 |
| Gemini 1.5 Pro | 81.8 | 77.9 | 24.3 | 74.8 | 65.4 | 61.5 | 54.6 | 59.3 | 39.3 | 68.6 | 51.6 | 83.2 | 47.2 | 25.6 | 76.2 | 66.8 | 68.7 | 59.1±9.6 |
| GPT-4o (Aug) | 80.9 | 76.0 | 15.1 | 80.7 | 71.8 | 64.7 | 56.4 | 68.2 | 59.3 | 75.5 | 59.7 | 84.5 | 58.6 | 43.7 | 79.6 | 77.0 | 71.2 | 65.1±9.3 |
| Gemma 2 9B IT (SFT) | 81.2 | 83.3 | 77.1 | 89.8 | 86.7 | 78.6 | 85.8 | 83.6 | 84.6 | 87.7 | 76.8 | 88.8 | 82.6 | 85.1 | 89.1 | 87.9 | 78.9 | 84.1 |
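As a quick check of how the AVG column relates to the per-language scores, the sketch below recomputes it for the GPT-4o (Aug) row of the intent detection table. The helper name `african_average` is hypothetical, not part of any released code. The ± term is not reproduced because the card does not say how it is computed.

```python
# Language codes in the column order used by the results tables.
LANGS = ["eng", "amh", "ewe", "hau", "ibo", "kin", "lin", "lug", "orm",
         "sna", "sot", "swa", "twi", "wol", "xho", "yor", "zul"]

# Intent detection accuracies for GPT-4o (Aug), copied from the table.
GPT4O_INTENT = [80.9, 76.0, 15.1, 80.7, 71.8, 64.7, 56.4, 68.2, 59.3,
                75.5, 59.7, 84.5, 58.6, 43.7, 79.6, 77.0, 71.2]


def african_average(scores_by_lang: dict[str, float]) -> float:
    """Mean over every language except English, rounded to one decimal."""
    african = [score for lang, score in scores_by_lang.items() if lang != "eng"]
    return round(sum(african) / len(african), 1)


row = dict(zip(LANGS, GPT4O_INTENT))
print(african_average(row))  # 65.1, matching the AVG column for this row
```

The English score is kept in the row for reference but dropped before averaging, which is why AVG can sit well below the `eng` column for most zero-shot models.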

Slot Filling

Metric: F1-score (%). Each score is averaged over five prompt templates; the AVG column averages over the African languages only, excluding English.

| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama 3.1 8B | 25.0 | 3.7 | 5.6 | 11.1 | 12.6 | 8.5 | 9.1 | 10.1 | 2.8 | 9.9 | 11.5 | 17.3 | 11.2 | 9.2 | 2.6 | 11.0 | 9.0 | 9.1±2.2 |
| Gemma 2 IT 9B | 34.1 | 4.5 | 0.3 | 7.4 | 10.6 | 5.0 | 6.0 | 5.6 | 0.1 | 7.3 | 10.8 | 21.2 | 2.4 | 2.6 | 2.2 | 5.2 | 8.2 | 6.2±2.9 |
| Aya-101 13B | 21.4 | 8.2 | 7.9 | 11.8 | 14.6 | 12.2 | 9.4 | 15.5 | 3.6 | 15.0 | 17.0 | 16.2 | 13.8 | 14.0 | 2.8 | 9.6 | 10.6 | 11.4±2.4 |
| Gemma 2 IT 27B | 49.8 | 15.7 | 9.5 | 24.1 | 25.2 | 21.7 | 15.2 | 28.4 | 2.6 | 29.8 | 28.0 | 40.2 | 24.3 | 23.3 | 4.5 | 28.1 | 31.0 | 22.0±5.8 |
| Llama 3.3 70B Instruct | 52.6 | 26.3 | 22.0 | 29.5 | 35.0 | 31.4 | 25.0 | 30.4 | 9.3 | 29.5 | 36.4 | 40.7 | 35.6 | 36.4 | 6.9 | 34.2 | 31.9 | 28.8±5.2 |
| Gemini 1.5 Pro | 52.8 | 15.2 | 18.7 | 31.9 | 35.8 | 34.4 | 34.9 | 34.4 | 12.2 | 36.8 | 43.0 | 37.5 | 34.5 | 34.2 | 6.9 | 33.2 | 38.6 | 30.1±6.1 |
| GPT-4o (Aug) | 55.4 | 22.8 | 19.4 | 37.8 | 38.9 | 36.4 | 33.5 | 35.3 | 13.0 | 40.2 | 40.9 | 46.5 | 40.1 | 37.9 | 10.0 | 42.4 | 37.6 | 33.3±6.0 |
| Gemma 2 9B IT (SFT) | 80.6 | 80.7 | 82.0 | 92.2 | 81.3 | 75.5 | 88.5 | 85.8 | 81.1 | 82.5 | 77.2 | 87.7 | 86.3 | 82.9 | 89.6 | 88.4 | 68.8 | 83.1 |
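The card does not spell out how the slot filling F1-score is computed. A common convention for slot filling is micro-averaged F1 over exactly matching spans, and the sketch below illustrates that convention only. It is an assumption, not the authors' evaluation script; the function name `span_f1` and the slot labels in the example are made up for illustration.

```python
Span = tuple[str, int, int]  # (slot_type, start_token, end_token); hypothetical layout


def span_f1(gold: list[set[Span]], pred: list[set[Span]]) -> float:
    """Micro-averaged F1 over spans; a span counts only on an exact match."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0  # also covers empty predictions or empty gold sets
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Two utterances: the second "city" span has the wrong start, so it is a miss.
gold = [{("date", 0, 1), ("city", 3, 4)}, {("time", 2, 3)}]
pred = [{("date", 0, 1), ("city", 2, 4)}, {("time", 2, 3)}]
print(round(100 * span_f1(gold, pred), 1))  # 66.7
```

Under exact matching, a prediction with the right slot type but a shifted boundary counts as both a false positive and a false negative, which is one reason slot filling scores can trail intent accuracy in a zero-shot setting.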

Bold values indicate the best performance for each language/metric.

Language Codes

  • eng: English
  • amh: Amharic
  • ewe: Ewe
  • hau: Hausa
  • ibo: Igbo
  • kin: Kinyarwanda
  • lin: Lingala
  • lug: Luganda
  • orm: Oromo
  • sna: Shona
  • sot: Sesotho
  • swa: Swahili
  • twi: Twi
  • wol: Wolof
  • xho: Xhosa
  • yor: Yoruba
  • zul: Zulu

Citation

@misc{yu2025injongo,
      title={INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages}, 
      author={Hao Yu and Jesujoba O. Alabi and Andiswa Bukula and Jian Yun Zhuang and En-Shiun Annie Lee and Tadesse Kebede Guge and Israel Abebe Azime and Happy Buzaaba and Blessing Kudzaishe Sibanda and Godson K. Kalipe and Jonathan Mukiibi and Salomon Kabongo Kabenamualu and Mmasibidi Setaka and Lolwethu Ndolela and Nkiruka Odu and Rooweither Mabuya and Shamsuddeen Hassan Muhammad and Salomey Osei and Sokhar Samb and Juliet W. Murage and Dietrich Klakow and David Ifeoluwa Adelani},
      year={2025},
      eprint={2502.09814},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.09814}, 
}
@misc{adelani2023sib200,
      title={SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects}, 
      author={David Ifeoluwa Adelani and Hannah Liu and Xiaoyu Shen and Nikita Vassilyev and Jesujoba O. Alabi and Yanke Mao and Haonan Gao and Annie En-Shiun Lee},
      year={2023},
      eprint={2309.07445},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}