SLPG/Punjabi_Shahmukhi_to_Gurmukhi_Transliteration

Punjabi Shahmukhi to Gurmukhi Transliteration System

Our supervised Punjabi transliteration systems, built using an unsupervised corpus, are bidirectional NMT systems that convert text between the Gurmukhi and Shahmukhi scripts. The Gurmukhi-to-Shahmukhi model achieves a BLEU score of 98.1 and 99.5% word-level accuracy, while the Shahmukhi-to-Gurmukhi model achieves a BLEU score of 87.7.

Corpus Details

- **Total Sentences:** 6.3 million
- **Domains Covered:** Various sources, including CCAligned, CCMatrix, TED, QED, OPUS, TICO, Wikimedia, MultiCCAligned, EMILLE, IJCNLP, XLEnt, and ParaCrawl
- **Test Corpus:** FLORES-101

Model Details

- **BLEU Score:** 87.7

You may also explore our Gurmukhi-to-Shahmukhi model, which achieves a BLEU score of 98.1.
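
This card reports word-level accuracy alongside BLEU but does not say how the accuracy figure is computed. The sketch below shows one plausible reading, assuming it is the share of whitespace-separated reference words that the system output reproduces at the same position. The function name `word_accuracy` and the placeholder tokens are illustrative only and are not taken from the paper or the model's evaluation scripts.

```python
def word_accuracy(hypotheses, references):
    """Percentage of reference words matched at the same position.

    Assumption: words are whitespace-separated and compared position by
    position; the paper may use a different definition.
    """
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")

    correct = 0
    total = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_words = hyp.split()
        ref_words = ref.split()
        total += len(ref_words)
        # zip stops at the shorter sentence, so missing words count as errors.
        correct += sum(h == r for h, r in zip(hyp_words, ref_words))

    return 100.0 * correct / total if total else 0.0


# Placeholder tokens stand in for transliterated words.
hyps = ["w1 w2 w3 w4", "w5 w6"]
refs = ["w1 w2 wX w4", "w5 w6"]
print(f"{word_accuracy(hyps, refs):.1f}")  # 5 of 6 words match: 83.3
```

A position-based comparison is reasonable for transliteration because output and reference usually contain the same number of words. For outputs that insert or drop words, an alignment-based measure would be a better fit.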

Usage

These resources are intended to facilitate research and development in the field of Punjabi transliteration. They can be used to train new models or improve existing ones, enabling high-quality transliteration between Gurmukhi and Shahmukhi scripts.
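
As a starting point, the repository files can be fetched with the `huggingface_hub` client library. This is a minimal sketch, assuming `huggingface_hub` is installed (`pip install huggingface_hub`). The helper name `download_model_files` is hypothetical, and this card does not state which files the repository contains or which toolkit loads them, so inspect the downloaded directory before going further.

```python
REPO_ID = "SLPG/Punjabi_Shahmukhi_to_Gurmukhi_Transliteration"


def download_model_files(repo_id=REPO_ID, local_dir=None):
    """Download every file in the model repository and return the local path.

    Hypothetical helper: it only fetches files and makes no assumption
    about the model format or the framework needed to run it.
    """
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_model_files()` needs network access and returns the path of the local snapshot. Pass `local_dir` to place the files in a specific directory instead of the default cache.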

Citation

If you use our model, kindly cite our paper:

@article{Shehzadi2024,
  title={Unsupervised Punjabi Corpus and Neural Machine Transliteration System},
  author={Shehzadi Ambreen and Sadaf Abdul Rauf and MG Abbas Malik and Muhammad Imran},
  journal={Heliyon},
  year={2024},
  note={Under review}
}
