DOI: 10.1145/3459104.3459146
research-article

Probabilistic and Neural Network Based POS Tagging of Ambiguous Nepali text: A Comparative Study

Published: 20 July 2021

ABSTRACT

Part-of-speech (POS) tagging is the task of assigning a part-of-speech tag to each word of a text, and there are various approaches to it. This article presents a comprehensive study and comparison of two POS tagging techniques for Nepali text: one based on the Hidden Markov Model (HMM) and one based on the General Regression Neural Network (GRNN). The two taggers resolve ambiguity in POS tagging of Nepali text through these different approaches. The taggers are evaluated on the corpora developed and provided by TDIL (Technology Development for Indian Languages). Apart from the corpora, the Python and Java programming languages and the NLTK library have been used for implementation. Both taggers achieve an accuracy of 100 percent for known words (with no ambiguity); for ambiguous words, the HMM tagger achieves 58.29 percent and the GRNN tagger 60.45 percent; and for non-ambiguous unknown words, the GRNN tagger achieves 85.36 percent.
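
As a rough illustration of how an HMM-based tagger can resolve an ambiguous word from its context, the sketch below trains a first-order (bigram) HMM on a tiny tagged sample and decodes with the Viterbi algorithm. It is not the authors' implementation: the English toy sentences, the tag names, the function names (`train_hmm`, `log_prob`, `viterbi`), and the add-one smoothing are all assumptions made for illustration, and none of the data comes from the TDIL corpora.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    vocabulary = set()
    for sentence in tagged_sentences:
        previous = START
        for word, tag in sentence:
            transitions[previous][tag] += 1
            emissions[tag][word] += 1
            vocabulary.add(word)
            previous = tag
    return transitions, emissions, vocabulary


def log_prob(counter, key, num_outcomes):
    """Add-one smoothed log probability (the smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + num_outcomes))


def viterbi(words, transitions, emissions, vocabulary):
    """Return the most probable tag sequence for `words` under the HMM."""
    if not words:
        return []
    tags = list(emissions)
    num_words = len(vocabulary) + 1  # one extra outcome reserved for unseen words
    # best[i][tag] = (log score of the best path ending in `tag` at i, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions[START], tag, len(tags))
                 + log_prob(emissions[tag], words[0], num_words))
        best[0][tag] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], num_words)
            best[i][tag] = max(
                (best[i - 1][prev][0]
                 + log_prob(transitions[prev], tag, len(tags))
                 + emit, prev)
                for prev in tags
            )
    # Backtrack from the best final tag to recover the full path.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


# Toy training data: "can" is ambiguous between a noun (NN) and a modal (MD).
toy_corpus = [
    [("the", "DT"), ("can", "NN"), ("rusts", "VB")],
    [("they", "PR"), ("can", "MD"), ("swim", "VB")],
]
model = train_hmm(toy_corpus)
print(viterbi(["they", "can", "swim"], *model))
print(viterbi(["the", "can", "rusts"], *model))
```

In this toy setting the word "can" has equal emission counts under both of its tags, so the choice between them comes entirely from the transition probabilities, which is the mechanism an HMM tagger relies on for ambiguous words.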

Published in

ISEEIE 2021: 2021 International Symposium on Electrical, Electronics and Information Engineering
February 2021
644 pages
ISBN: 9781450389839
DOI: 10.1145/3459104

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited
