Abstract
Reduplicated Multiword Expressions (RMWEs) are abundant in Manipuri, a highly agglutinative Indian language. A Part of Speech (POS) tagger for Manipuri using a Support Vector Machine (SVM) has been developed and evaluated. The POS tagger has then been updated with the identified RMWEs as an additional feature, and the performance of the SVM-based POS tagger before and after adding the RMWE feature has been compared. The SVM-based POS tagger achieves an F-Score of 77.67%, which increases to 79.61% with RMWE as an additional feature, an absolute gain of 1.94 percentage points. Thus the performance of the POS tagger improves after adding RMWE as an additional feature.
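As a rough illustration of what "RMWE as an additional feature" could look like, the following minimal Python sketch builds a per-token feature dictionary with and without an RMWE flag. It is not the authors' implementation: the function names, the feature set, and the placeholder tokens are all assumptions made for illustration, and the detector covers only complete reduplication (two identical adjacent tokens), which is one simple case of reduplication. The `f_score` helper uses the standard balanced F-measure, assumed here to be the metric reported in the abstract.

```python
from typing import Dict, List, Optional


def mark_complete_reduplication(tokens: List[str]) -> List[bool]:
    """Flag tokens that are part of an adjacent identical pair.

    Assumption: only complete reduplication is detected; other
    reduplication types are out of scope for this sketch.
    """
    flags = [False] * len(tokens)
    for i in range(len(tokens) - 1):
        if tokens[i] == tokens[i + 1]:
            flags[i] = True
            flags[i + 1] = True
    return flags


def token_features(
    tokens: List[str],
    i: int,
    rmwe_flags: Optional[List[bool]] = None,
) -> Dict[str, object]:
    """Build a hypothetical feature dictionary for the token at index i."""
    feats: Dict[str, object] = {
        "word": tokens[i],
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        "suffix3": tokens[i][-3:],
    }
    # The RMWE flag is the extra feature; omit it to get the baseline set.
    if rmwe_flags is not None:
        feats["is_rmwe"] = rmwe_flags[i]
    return feats


def f_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder tokens, not real Manipuri words.
tokens = ["w1", "w2", "w2", "w3"]
flags = mark_complete_reduplication(tokens)
baseline = token_features(tokens, 1)
with_rmwe = token_features(tokens, 1, flags)

# Absolute gain reported in the abstract, in percentage points.
gain = round(79.61 - 77.67, 2)
```

The feature dictionaries produced this way would still need to be vectorised before being passed to any SVM toolkit; that step depends on the toolkit and is left out here.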
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nongmeikapam, K., Sharma, A.U., Devi, L.M., Keisam, N., Singh, K.D., Bandyaopadhyay, S. (2012). Will the Identification of Reduplicated Multiword Expression (RMWE) Improve the Performance of SVM Based Manipuri POS Tagging?. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_10
DOI: https://doi.org/10.1007/978-3-642-28604-9_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28603-2
Online ISBN: 978-3-642-28604-9
eBook Packages: Computer Science, Computer Science (R0)