Arabic Small Nougat

#1
by johnlockejrr - opened

Sorry to disturb you. Could you kindly share the method you used to train this beautiful model (maybe a Python script or notebook)? I'm trying to train an Arabic model for some medieval manuscripts. I have ground truth as ALTO (I can easily convert it to image/CSV or text). What should the original dataset look like? I see the dataset you used, but it is already in pickle format, so I don't know what the raw data looked like. Thank you!
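
For readers in the same situation, the ALTO-to-text conversion mentioned in the question could look roughly like the sketch below. It is only an illustration using the Python standard library: the helper names `local_name` and `alto_to_lines` and the sample XML are hypothetical, and the raw format the model author actually trained on is not stated in this thread.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_to_lines(alto_xml: str) -> list[str]:
    """Return one string per ALTO TextLine, joining its String elements."""
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Each word is stored in the CONTENT attribute of a String element.
        words = [
            child.attrib.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(w for w in words if w))
    return lines


# Hypothetical sample document, invented for illustration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="بسم"/><SP/><String CONTENT="الله"/></TextLine>
    <TextLine><String CONTENT="الرحمن"/><SP/><String CONTENT="الرحيم"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    for line in alto_to_lines(SAMPLE):
        print(line)
```

Each returned string could then be paired with the matching line or page image to build image/text training pairs, assuming that is the layout the training code expects.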

Hello @johnlockejrr ,

I am working on a larger variant of this model, and with its release I will open-source my datasets, training code, and a paper explaining everything.

Happy that the model is beneficial for you ^^

Wow! Thank you so much @MohamedRashad ! Can't wait!

Any updates? 😇

@johnlockejrr
Still working on it

Any news, brother?

@johnlockejrr You are a very patient man
I hope you like the new models 🤗

MohamedRashad changed discussion status to closed

Very, very much! I love Yahya's cover in the model card (may he dwell in His spacious gardens).
I love your work, bro!
Do you have الشوك والقرنفل (The Thorn and the Carnation) in the dataset?

@johnlockejrr
Maybe. I didn't check every entry in the dataset.

Thanks! No problem, I will take a look myself.
