Tested on handwritten historical manuscripts

#7
by badrex - opened

Hi

Thanks a lot for developing and sharing the model. It is indeed a valuable contribution to the Arabic language tech community.

I played a bit with the model using some handwritten historical manuscripts and I observed something interesting.

The behavior is illustrated in the image below. Even though the input has no diacritics, the output is fully diacritized.

Is this expected behavior? If not, what might be driving it?

Best,
Badr

Screenshot from 2025-06-15 18-41-26.png

Another example

Screenshot from 2025-06-15 18-51-25.png

Network for Advancing Modern ArabicNLP & AI org

Thank you for the feedback. This is probably caused by a font that the model hasn't seen before and that is close to some of the diacritized text used in training. We can take a closer look. Also, if you have a dataset, we can fine-tune the model again to support this font and style.
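
Until the model is fine-tuned on this style, one possible workaround is to post-process the output. The snippet below is only a minimal sketch and not part of the model or its API: `strip_diacritics` and `diacritic_ratio` are hypothetical helper names, and the set of marks to remove (U+064B to U+0652 plus the superscript alef U+0670) is an assumption that may need adjusting for your manuscripts.

```python
import unicodedata

# Assumption: treat the common Arabic harakat, tanwin, shadda and sukun
# (U+064B..U+0652) plus superscript alef (U+0670) as "diacritics".
ARABIC_DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text: str) -> str:
    """Return text with the marks in ARABIC_DIACRITICS removed."""
    return "".join(ch for ch in text if ch not in ARABIC_DIACRITICS)


def diacritic_ratio(text: str) -> float:
    """Number of diacritic marks per letter (Unicode category 'Lo')."""
    letters = sum(1 for ch in text if unicodedata.category(ch) == "Lo")
    marks = sum(1 for ch in text if ch in ARABIC_DIACRITICS)
    return marks / letters if letters else 0.0


if __name__ == "__main__":
    # Hypothetical model output: a fully diacritized three-letter word.
    output = "\u0643\u064E\u062A\u064E\u0628\u064E"
    print(diacritic_ratio(output))
    print(strip_diacritics(output))
```

Comparing `diacritic_ratio` on the model output against a manual transcription of the same page could also help quantify how often the model adds diacritics that are not in the input.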
