Abstract
In this paper we explore different approaches for parsing Telugu. We consider three popular dependency parsers, namely MaltParser, MSTParser and TurboParser. We first experiment with different parser and feature settings and show the impact of each setting. We then explore different ways of ensembling these parsers. We also provide a detailed analysis of the performance of all the approaches on major dependency labels and different distance ranges. We report our results on the test data of the Telugu dependency treebank provided in the ICON 2010 tools contest on Indian language dependency parsing. We obtain state-of-the-art performance of 91.8% in unlabelled attachment score and 70.0% in labelled attachment score.
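For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how unlabelled and labelled attachment scores are conventionally computed. It is not the evaluation script used in the paper; the function name `attachment_scores`, the token representation, and the toy sentence with its labels are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical token representation: (head_index, dependency_label),
# where head index 0 stands for the artificial root.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS counts tokens whose predicted head matches the gold head.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS counts tokens whose predicted head and label both match.
    both_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_hits / n, 100.0 * both_hits / n


# Toy four-token sentence; heads and labels are invented for the example.
gold = [(2, "k1"), (0, "main"), (2, "k2"), (2, "k7p")]
pred = [(2, "k1"), (0, "main"), (2, "k1"), (1, "k7p")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.1f}%, LAS = {las:.1f}%")
```

Because a token only counts toward the labelled score when its head is also correct, the labelled score can never exceed the unlabelled one, which is consistent with the gap between the 91.8% and 70.0% figures reported above.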
Acknowledgements
We would like to thank the Language Technologies Research Centre (LTRC), International Institute of Information Technology, Hyderabad (IIIT-H), for providing the Telugu dependency treebank.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Venkata Seshu Kumari, B., Giri Prasaad, A., Susmitha, M., R., V.R., Bhatnagar, R. (2020). Exploring Different Approaches for Parsing Telugu. In: Hassanien, A., Azar, A., Gaber, T., Bhatnagar, R., F. Tolba, M. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019). AMLTA 2019. Advances in Intelligent Systems and Computing, vol 921. Springer, Cham. https://doi.org/10.1007/978-3-030-14118-9_55
DOI: https://doi.org/10.1007/978-3-030-14118-9_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14117-2
Online ISBN: 978-3-030-14118-9
eBook Packages: Intelligent Technologies and Robotics (R0)