Different variations of diacritics

#3
by thewh1teagle - opened

I would like to get multiple variations of diacritics for a sentence.

For instance, with 'Shalom Olam':
שלום עולם
The diacritics I get from the model read 'Shlom Olam':
שְׁלוֹם עוֹלָם

I tried to implement beam search but couldn't get different variations.
Thanks

thewh1teagle changed discussion title from Beam search example to Different variations of diacritics

It seems the vocalization is for "peace of the world", not "Hello world"! :)

DICTA: The Israel Center for Text Analysis org

Indeed, the current architecture does not allow retrieving multiple variations of diacritics for each word/the sentence. We are looking into training a model with a different architecture, but that is currently only in research.
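
For anyone experimenting in the meantime, here is a minimal sketch of what retrieving variations could look like if a model did expose per-position candidate marks with probabilities. That is an assumption, not something the current model offers, and the function name `k_best_sequences` and the scores below are made up for illustration. The code points are written as escapes to avoid rendering issues.

```python
import heapq
import math


def k_best_sequences(candidates, k=3):
    """Return the k most probable sequences from per-position candidates.

    candidates: list with one entry per position, each a list of
    (text, probability) pairs. Positions are assumed to be independent,
    so keeping the k best prefixes at every step loses nothing.
    """
    beams = [(0.0, [])]  # (cumulative log-probability, pieces so far)
    for options in candidates:
        expanded = [
            (score + math.log(prob), pieces + [text])
            for score, pieces in beams
            for text, prob in options
        ]
        beams = heapq.nlargest(k, expanded, key=lambda beam: beam[0])
    return [("".join(pieces), math.exp(score)) for score, pieces in beams]


# Hypothetical scores for the first word only (not real model output).
candidates = [
    [("\u05e9\u05c1\u05b0", 0.6),   # shin + shin dot + shva   ("Shlom")
     ("\u05e9\u05c1\u05b8", 0.4)],  # shin + shin dot + qamats ("Shalom")
    [("\u05dc", 1.0)],              # lamed
    [("\u05d5\u05b9", 1.0)],        # vav + holam
    [("\u05dd", 1.0)],              # final mem
]

variants = k_best_sequences(candidates, k=2)
for text, prob in variants:
    print(text, round(prob, 3))
```

With scores like these, the second-ranked variant would be the 'Shalom' reading; whether any real model produces usable per-position scores is a separate question.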

I noticed some differences between the model's nikud and the Dicta website's, in terms of modernity.
For instance, when I enter שלום עולם on the Dicta website I get 'Shalom Olam', but the model gives 'Shlom Olam'. The model's nikud seems a bit 'less modern' than the website's. That's why I asked for a way to get variations.
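
To make the difference concrete, here is a small sketch using only Python's standard `unicodedata` module that reports which marks differ between two vocalizations of the same letters. The helper names `split_marks` and `diff_nikud` are my own, and the "website" string is reconstructed from the romanization 'Shalom', not copied from the site.

```python
import unicodedata


def split_marks(text):
    """Group each base character with the combining marks that follow it."""
    groups = []
    for ch in text:
        if unicodedata.combining(ch) and groups:
            groups[-1][1].append(ch)
        else:
            groups.append((ch, []))
    return groups


def diff_nikud(a, b):
    """List (base letter, marks only in a, marks only in b) where a and b differ."""
    groups_a, groups_b = split_marks(a), split_marks(b)
    if [g[0] for g in groups_a] != [g[0] for g in groups_b]:
        raise ValueError("strings do not share the same base letters")
    diffs = []
    for (base, marks_a), (_, marks_b) in zip(groups_a, groups_b):
        only_a = sorted(unicodedata.name(m) for m in set(marks_a) - set(marks_b))
        only_b = sorted(unicodedata.name(m) for m in set(marks_b) - set(marks_a))
        if only_a or only_b:
            diffs.append((base, only_a, only_b))
    return diffs


# First word only; escapes used to avoid rendering issues.
model_output = "\u05e9\u05c1\u05b0\u05dc\u05d5\u05b9\u05dd"    # 'Shlom'  (shva under the shin)
website_output = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"  # 'Shalom' (qamats under the shin), assumed

differences = diff_nikud(model_output, website_output)
print(differences)
```

For this pair the only difference is the vowel under the shin: a shva in the model output versus a qamats in the expected reading.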

If you plan another model, I hope it can include more modern nikud, as well as Shva Na and Atama'a! ;)
Thank you very much for the model. It's much appreciated.
