Abstract:
Developing an effective system for transliterating loanwords and named entities between orthographically very different languages is a challenging task. Such problems are generally addressed using two approaches: (i) dictionary-based mapping, and (ii) machine learning techniques. Most dictionary-based approaches focus on developing either phoneme-based or grapheme-based mapping rules. In this paper, we investigate the effect of various transliteration models for transliterating English loanwords and named entities into the Manipuri language. First, we compare the performance of state-of-the-art learning models, namely RNN, LSTM, and Bi-LSTM using a seq2seq auto-encoder [1], against dictionary-based models. Across the experiments, Bi-LSTM is found to outperform its counterparts. Further, we compare the performance of the different transliteration methods over phoneme-based and grapheme-based representations. The experiments show that the grapheme-based representation outperforms its phoneme-based counterpart in both dictionary-based and learning-based methods.
Date of Conference: 15-17 November 2018
Date Added to IEEE Xplore: 31 January 2019