Abstract
In this paper, we propose a way of incorporating morphological resources to enhance the performance of neural network based dependency parsing. We conduct our experiments on Hindi, a morphologically rich language, and report results on two well-known Hindi dependency parsing datasets. We show an improvement in both Unlabeled Attachment Score (UAS) and Labeled Attachment Score (LAS) compared to previous state-of-the-art Hindi dependency parsers using only word embeddings, POS tag embeddings and arc-label embeddings as features. Adding morphological features of words, namely number, gender, person and case, yields a further improvement in both LAS and UAS. We find that many of the erroneously parsed sentences contain Named Entities, and we propose a treatment for Named Entities that further improves both the UAS and LAS of our Hindi dependency parser. (The parser is available at http://www.cicling.org/2016/data/126/CICLing_126.zip.)
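As a reminder of what the two reported metrics measure, the following is a minimal sketch of how UAS and LAS are commonly computed: UAS is the fraction of tokens whose predicted head matches the gold head, and LAS additionally requires the arc label to match. The function name `attachment_scores`, the `(head, label)` token representation and the labels in the toy example are illustrative assumptions, not taken from the paper; the authors' evaluation may differ in details such as the treatment of punctuation.

```python
from typing import List, Tuple

# Hypothetical representation: each token is (head_index, arc_label),
# with head_index 0 standing for the artificial root.
Token = Tuple[int, str]


def attachment_scores(gold: List[List[Token]],
                      pred: List[List[Token]]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions over all tokens in the corpus."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora differ in sentence count")
    total = uas_hits = las_hits = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                uas_hits += 1          # correct head: counts towards UAS
                if g_label == p_label:
                    las_hits += 1      # correct head and label: counts towards LAS
    if total == 0:
        raise ValueError("no tokens to evaluate")
    return uas_hits / total, las_hits / total


# Toy four-token sentence with made-up labels (assumption, for illustration only).
gold_example = [[(2, "k1"), (0, "main"), (4, "nmod"), (2, "k2")]]
pred_example = [[(2, "k1"), (0, "main"), (2, "nmod"), (2, "k1")]]

uas, las = attachment_scores(gold_example, pred_example)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

In the toy sentence, three of the four predicted heads are correct and two of those also carry the correct label, so LAS can never exceed UAS.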
Acknowledgement
Professor Sudeshna Sarkar acknowledges DEITY, Government of India for support under the ILMT Project.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Saha, A., Sarkar, S. (2018). Enhancing Neural Network Based Dependency Parsing Using Morphological Information for Hindi. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_26
DOI: https://doi.org/10.1007/978-3-319-75477-2_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science, Computer Science (R0)