mena-open-data's Collections
Arabic NLP datasets
lightonai/nanobeir-multilingual • Viewer • 522k • 693 • 11
Viewer • 47.8M • 42.8k • 31
Viewer • 2.72k • 23 • 1
Viewer • 7.42k • 17 • 2
Viewer • 149 • 4
Viewer • 4.13k • 240 • 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class • Viewer • 981k • 208 • 2
malaysia-ai/Multilingual-TTS • Viewer • 37.1M • 2k • 14
opendatalab/WanJuanSiLu-Multimodal-5Languages • Preview • 34 • 4
Preview • 49 • 35
Viewer • 66k • 71 • 11
LLaMAX/BenchMAX_Function_Completion • Viewer • 2.79k • 139 • 1
Viewer • 8.86k • 38 • 8
Viewer • 3.25M • 33 • 3
MLCommons/ml_spoken_words • 746 • 35
Twitter/HashtagPrediction • Viewer • 1.07M • 60 • 2
Viewer • 1.4M • 255 • 1
Viewer • 3.62M • 777 • 2
Viewer • 197k • 386 • 4
Viewer • 54.9k • 4.74k • 83
Viewer • 108k • 2.06k • 66
2.17k • 16
Viewer • 624 • 12 • 4
Viewer • 5.07k • 151
Viewer • 13.3k • 36 • 5
Viewer • 200 • 76
Viewer • 37.4k • 251 • 4
229 • 4
Viewer • 130k • 70 • 2
Viewer • 3.12k • 148
vg055/SemEval2025_Task11_TrackA • Viewer • 2k • 10
sarulab-speech/commonvoice22_sidon • Viewer • 15.1M • 941 • 16
Preview • 8
ToxicityPrompts/PolyGuardMix • Viewer • 1.91M • 389 • 4
Viewer • 481k • 57 • 15
Preview • 30 • 8
Viewer • 124M • 2.91k • 17
linagora/linto-dataset-audio-ar-tn • Viewer • 37.3k • 730 • 14
Viewer • 13.6k • 776 • 27
Viewer • 667k • 1.49k • 40
Viewer • 9.71k • 594 • 20
fr3on/election-questions-arabic • Viewer • 1.49k • 12
28 • 8
Viewer • 3 • 18 • 1
244 • 22
papluca/language-identification • Viewer • 90k • 1.61k • 63
vincentkoc/tiny_qa_benchmark_pp • Viewer • 662 • 1.14k • 2
Viewer • 70.3M • 943 • 17
Viewer • 88.8k • 11.3k • 1.48k
Viewer • 4.8k • 17
s-nlp/EverGreen-Multilingual • Viewer • 4.76k • 18 • 1
camel-ai/ai_society_translated • Preview • 26 • 16
LLaMAX/BenchMAX_Problem_Solving • Viewer • 12.1k • 134 • 1
alexandrainst/multi-wiki-qa • Viewer • 1.22M • 778 • 22
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_nobots • Viewer • 4.68k • 20 • 3
Melaraby/EvArEST-dataset-for-Arabic-scene-text-recognition • Viewer • 296k • 69
mozilla-foundation/common_voice_17_0 • 2.48k • 4
suchirsalhan/Phonemized-UD • Viewer • 1.19M • 256
LLMXperts/Arabic-NLi-Triplet • Viewer • 571k • 15
726 • 3
adithya7/xlel_wd_dictionary • Viewer • 230k • 70 • 3
Viewer • 10k • 257 • 54
Viewer • 86.8M • 235 • 23
Viewer • 76.3k • 1k • 4
Viewer • 78k • 89 • 3
Viewer • 46.2k • 589 • 28
SaiedAlshahrani/Detect-Egyptian-Wikipedia-Articles • Viewer • 756k • 41 • 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair • Viewer • 328k • 27 • 4
aida-ugent/llm-ideology-analysis • Viewer • 315k • 93 • 4
Viewer • 1.2k • 9 • 7
Viewer • 206k • 3.24k • 338
Viewer • 290k • 284 • 42
Viewer • 255k • 89 • 5
Preview • 150 • 3
tellarin-ai/ntx_llm_instructions • Viewer • 5.98k • 21
Viewer • 29.2k • 4.94k • 36
UBC-NLP/nilechat-arabizi-mor • Viewer • 1.45M • 13 • 2
Viewer • 2.14M • 33 • 5
CohereLabs/include-lite-44 • Viewer • 10.8k • 534 • 14
Viewer • 3.48k • 367 • 14
Viewer • 7.35k • 60
Viewer • 5.16k • 141 • 5
JQL-AI/JQL-Human-Edu-Annotations • Viewer • 20.4k • 22 • 5
Viewer • 9.03B • 24.4k • 36
Viewer • 310k • 499 • 10
CohereLabs/fusion-pairwise-evals-finetuned • Viewer • 5.25k • 29
Viewer • 400 • 36 • 8
Viewer • 8.69k • 17 • 1
faisaltareque/XL-HeadTags • Viewer • 415k • 27 • 3
Viewer • 3.91M • 84 • 6
Viewer • 100 • 27 • 1
Viewer • 798k • 1.95k • 88
Viewer • 330 • 18 • 3
Viewer • 94.4k • 403 • 11
44 • 8
CohereLabs/fusion-synth-data-ufb • Viewer • 94.7k • 27 • 1
QCRI/AraDICE-ArabicMMLU-egy • Viewer • 14.5k • 224 • 1
Viewer • 121 • 92 • 3
Viewer • 2.97M • 1.22k • 29
ClusterlabAi/101_billion_arabic_words_dataset • Viewer • 33.1M • 538 • 72
omar-emad/financesecondtrial • Viewer • 30 • 10
Viewer • 11.4k • 12
Viewer • 695k • 723 • 11
CohereLabs/deja-vu-pairwise-evals • 18 • 3
kaust-generative-ai/fineweb-edu-ar • Viewer • 363M • 14 • 13
Preview • 22 • 1
Viewer • 893 • 26 • 1
Viewer • 135k • 342 • 1
UBC-NLP/nilechat-arabizi-egy • Viewer • 572k • 126 • 1
Viewer • 761k • 7 • 3
Viewer • 11.1k • 45 • 5
KFUPM-JRCAI/arabic-generated-abstracts • Viewer • 8.39k • 111 • 3
Viewer • 5.73k • 25 • 6
badrex/ALDi-predictions-MADIS5 • Viewer • 263 • 5
Viewer • 467k • 42 • 2
Viewer • 10.1k • 69 • 2
CohereLabs/include-base-44 • Viewer • 23k • 4.17k • 44
CohereLabs/m-ArenaHard-v2.0 • Viewer • 11.5k • 165 • 5
Viewer • 77.2M • 2.09k • 52
ToxicityPrompts/PolyGuardPrompts • Viewer • 29.3k • 255 • 2
219 • 2
SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101 • Viewer • 728k • 97 • 5
QCRI/AraDICE-ArabicMMLU-lev • Viewer • 14.5k • 201
Viewer • 97.6k • 942 • 47
1.05k • 12
Viewer • 141k • 16 • 7
CohereLabsCommunity/afri-aya • Viewer • 2.47k • 128 • 11
Omar-youssef/Egyptian-text-summarization • Viewer • 3.69k • 19
jonathanmutal/Medical-Questionnaire-Multilingual-Translation • Preview • 8
24.9k • 41
CohereLabs/Global-MMLU-Lite • Viewer • 10.9k • 4.99k • 30
MBZUAI/speecht5_tts_clartts_ar • Text-to-Speech • 2.13k • 27
LLaMAX/BenchMAX_General_Translation • Viewer • 228k • 208
abdullah-alamodi/aqeedah-rag-dataset • Viewer • 5.42k • 39 • 1
Viewer • 63.8k • 267 • 1
Viewer • 127k • 2.8k • 30
Viewer • 5.1M • 538 • 48
sboughorbel/arabic-web-edu-seed • Viewer • 236k • 10 • 3
amphora/Open-R1-Mulitlingual-SFT • Viewer • 128k • 26 • 3
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_bots • Viewer • 5.4k • 19
brighter-dataset/BRIGHTER-emotion-intensities • Viewer • 41.2k • 103 • 4
LLaMAX/BenchMAX_Domain_Translation • Viewer • 47.3k • 31
LLaMAX/BenchMAX_Rule-based • Viewer • 7.29k • 141 • 2
ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks • Paper • 2511.10090
Viewer • 393k • 7.92k • 516
Omar-youssef/islamic-qa-egyptian-arabic • Viewer • 7.47k • 17
alconost/alconost-multilingual-speech-gold • Viewer • 330 • 23
LLaMAX/BenchMAX_Question_Answering • Viewer • 17 • 29
2A2I/Arabic-OpenHermes-2.5 • Viewer • 982k • 120 • 20
FreedomIntelligence/ApolloMoEDataset • Viewer • 293k • 285 • 5
SaiedAlshahrani/Arabic_Wikipedia_20230101_bots • Viewer • 1.09M • 58 • 1
UBC-NLP/palmx_2025_subtask1_culture • Viewer • 4.5k • 71 • 1
Viewer • 17.6M • 42 • 4
Viewer • 8.79k • 160 • 41
Viewer • 158k • 52 • 7
UBC-NLP/nilechat-fw-edu-egy • Viewer • 5.52M • 40 • 3
LLaMAX/BenchMAX_Model-based • Viewer • 8.5k • 62
Viewer • 180 • 232 • 1
Raniahossam33/Arabic_cultural_dataset • Viewer • 12.1k • 7 • 2
Preview • 12
Viewer • 380M • 13.4k • 55
Viewer • 7.18B • 37k • 587
visheratin/laion-coco-nllb • Viewer • 894k • 438 • 44
obadx/recitation-segmentation-augmented • Viewer • 64.6k • 134
Viewer • 159M • 1.75k • 12
Viewer • 2.56M • 6.92k • 80
Viewer • 602k • 13.8k • 150
Viewer • 13.2k • 45 • 2
rabah2026/Quran-Ayah-Corpus • Viewer • 263k • 110 • 1
omar-emad/FinanceTripletSecond • Viewer • 30 • 5
Viewer • 3.3k • 132 • 11
Viewer • 6.98k • 69 • 10
Viewer • 1.05M • 41 • 12
UBC-NLP/palmx_2025_subtask2_islamic • Viewer • 1.9k • 60
Viewer • 388 • 98
rubricreward/m-reward-bench • Viewer • 66k • 2
Fujitsu-FRE/MAPS_Verified • Viewer • 3.05k • 4.06k • 3
Viewer • 135k • 6.26k • 285
LLaMAX/BenchMAX_Multiple_Functions • Viewer • 5.41k • 28
Fumika/Wikinews-multilingual • Viewer • 15.2k • 15 • 7
Omartificial-Intelligence-Space/awesome_chatgpt_prompts_ar • Viewer • 201 • 28 • 1
mrlbenchmarks/global-piqa-nonparallel • Viewer • 11.6k • 2.55k • 32
NAMAA-Space/QariOCR-v0.3-markdown-mixed-dataset • Viewer • 37k • 79 • 11
Viewer • 1.49M • 17 • 3
Viewer • 23k • 1.09k • 1
m0pper/Small-Multilingual-Corpora • Viewer • 7.61M • 44
Viewer • 236k • 45
Preview • 8
haoranxu/X-ALMA-Preference • Viewer • 772k • 44 • 6
SaiedAlshahrani/Arabic_Wikipedia_20230101_nobots • Viewer • 847k • 37 • 2
Viewer • 367 • 26 • 2
vgaraujov/semeval-2025-task11-track-c • Viewer • 57.3k • 40
Viewer • 935 • 78 • 1
Viewer • 3.94k • 76
Viewer • 7.62k • 16.3k • 3
Viewer • 10.4k • 2.06k • 35
275 • 124
brighter-dataset/BRIGHTER-emotion-categories • Viewer • 140k • 596 • 14
lukasellinger/homonym-mcl-wic • Viewer • 1.61k • 69
Viewer • 160 • 18 • 3
Preview • 7
HeshamHaroon/Arabic_Function_Calling • Viewer • 50.8k • 159 • 59