AtlasIA

community

https://atlasia.ma/

atlasia-ma


AI & ML interests

Open-source data and models for Morocco.

Recent Activity

AymaneElfirdo updated a dataset 1 day ago

atlasia/chatbot-arena-db

oumayma03 updated a Space 5 days ago

atlasia/moul_lmemes

01Yassine authored a paper 27 days ago

L1-aware Multilingual Mispronunciation Detection Framework


Articles

Creating your custom Ghibli Text-to-Image model

May 1

• 17

Atlaset Dataset for Moroccan Darija: From Data Collection, Analysis, to Model Trainings

Mar 6

• 25

Darija Chatbot Arena: Making LLMs Compete in the Moroccan Dialect

Feb 10

• 14

AymaneElfirdo

updated a dataset 1 day ago

atlasia/chatbot-arena-db

Updated 1 day ago • 60 • 1

oumayma03

updated a Space 5 days ago

Memes Lharba

🏢

Find Moroccan memes based on your query

01Yassine

authored 12 papers 27 days ago

L1-aware Multilingual Mispronunciation Detection Framework

Paper • 2309.07719 • Published Sep 14, 2023

SpeechBlender: Speech Augmentation Framework for Mispronunciation Data Generation

Paper • 2211.00923 • Published Nov 2, 2022

Automatic Pronunciation Assessment -- A Review

Paper • 2310.13974 • Published Oct 21, 2023

BiCrossMamba-ST: Speech Deepfake Detection with Bidirectional Mamba Spectro-Temporal Cross-Attention

Paper • 2505.13930 • Published May 20

Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection

Paper • 2502.03559 • Published Feb 5

Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic

Paper • 2408.02430 • Published Aug 5, 2024

Towards a Unified Benchmark for Arabic Pronunciation Assessment: Quranic Recitation as Case Study

Paper • 2506.07722 • Published 28 days ago

alielfilali01

authored a paper about 1 month ago

Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi

Paper • 2504.06011 • Published Apr 8 • 2

lekssays

authored a paper about 2 months ago

TechniqueRAG: Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text

Paper • 2505.11988 • Published May 17 • 2

alielfilali01

posted an update 2 months ago

Post

712

Great effort from the @AtlasIA folks in adapting text2image models (Ghibli style) to the Moroccan context.

Read the blog here: https://huggingface.co/blog/atlasia/creating-your-custom-ghibli-text-to-image-model

nouamanetazi

authored a paper 3 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 192

alielfilali01

posted an update 5 months ago

Post

1050

🚨 Arabic LLM Evaluation 🚨

A few models joined the ranking on https://huggingface.co/spaces/inceptionai/AraGen-Leaderboard today.

The new MistralAI model, Saba, is quite impressive: top 10! Well done @arthurmensch and team.

Sadly, Mistral did not follow its public-weights strategy this time. We hope this changes soon and that we get the model under a permissive license.

We also added other Mistral models and, apparently, we have been sleeping on mistralai/Mistral-Large-Instruct-2411!

Another impressive model that joined the ranking today is ALLaM-AI/ALLaM-7B-Instruct-preview. After a long wait, ALLaM is finally here, and it is IMPRESSIVE given its size!

ALLaM is ranked on OALL/Open-Arabic-LLM-Leaderboard as well.

alielfilali01

posted an update 6 months ago

Post

2137

The 3C3H AraGen Leaderboard today welcomes deepseek-ai/DeepSeek-V3 and 12 other models (including the late gpt-3.5 💀) to the ranking of the best LLMs in Arabic!

Observations:
- DeepSeek-V3 ranked 3rd and is the only open model among the top 5!
- A 14B open model (Qwen/Qwen2.5-14B-Instruct) outperforms gpt-3.5-turbo-0125 (from last year). This shows how far we have come in advancing and supporting the Arabic presence within the LLM ecosystem!
- Contrary to what is observed in likelihood-acc leaderboards (like OALL/Open-Arabic-LLM-Leaderboard), further finetuned models like maldv/Qwentile2.5-32B-Instruct actually perform worse than the original model, Qwen/Qwen2.5-32B-Instruct.
It's worth noting that the decrease is statistically insignificant, which implies that, at best, out-of-domain finetuning does not really hurt the original capabilities the model acquired during pretraining.
Previous work has addressed this (finetuning vs. pretraining), but more investigation is required (any PhDs here? This could be your question ...)

Check out the latest rankings: https://huggingface.co/spaces/inceptionai/AraGen-Leaderboard


Team members 18
