Enhancing Neural Network Based Dependency Parsing Using Morphological Information for Hindi

Saha, Agnivo; Sarkar, Sudeshna

doi:10.1007/978-3-319-75477-2_26

Conference paper
First Online: 21 March 2018

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9623)

Abstract

In this paper, we propose a way of incorporating morphological resources to enhance the performance of neural network based dependency parsing. We conduct our experiments on Hindi, which is a morphologically rich language, and report our results on two well-known Hindi dependency parsing datasets. We show an improvement in both Unlabeled Attachment Score (UAS) and Labeled Attachment Score (LAS) over previous state-of-the-art Hindi dependency parsers using only word embeddings, POS tag embeddings and arc-label embeddings as features. Using morphological features, such as the number, gender, person and case of words, we achieve an additional improvement in both LAS and UAS. We find that many of the incorrectly parsed sentences contain Named Entities. We propose a treatment for Named Entities which further improves both the UAS and LAS of our Hindi dependency parser. The parser is available at http://www.cicling.org/2016/data/126/CICLing_126.zip.
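
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how UAS and LAS are conventionally computed. It is not taken from the authors' parser. The function name `attachment_scores`, the `(head, label)` token representation, and the example labels are assumptions made for illustration, and every token is scored here, whereas some evaluation setups exclude punctuation.

```python
from typing import List, Tuple

# Hypothetical representation: each token is (head_index, dependency_label),
# with head index 0 standing for the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[List[Arc]],
                      pred: List[List[Arc]]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions over all tokens in the corpus."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora differ in sentence count")

    total = 0
    correct_heads = 0    # head matches (counts towards UAS)
    correct_labeled = 0  # head and label both match (counts towards LAS)

    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                correct_heads += 1
                if g_label == p_label:
                    correct_labeled += 1

    if total == 0:
        raise ValueError("cannot score an empty corpus")
    return correct_heads / total, correct_labeled / total


# Illustrative three-token sentence; the labels are placeholders.
gold = [[(2, "k1"), (0, "root"), (2, "k2")]]
pred = [[(2, "k1"), (0, "root"), (2, "k1")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.3f}, LAS = {las:.3f}")
```

In this example all three heads are correct but one label is wrong, so UAS is 3/3 and LAS is 2/3. LAS can never exceed UAS, because a token counts towards LAS only if its head is already correct.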

Notes

1. http://code.google.com/p/word2vec/

Acknowledgement

Professor Sudeshna Sarkar acknowledges DEITY, Government of India for support under the ILMT Project.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, Kharagpur, 721302, West Bengal, India
Agnivo Saha & Sudeshna Sarkar

Corresponding authors

Correspondence to Agnivo Saha or Sudeshna Sarkar.

Editor information

Editors and Affiliations

CIC, Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

Cite this paper

Saha, A., Sarkar, S. (2018). Enhancing Neural Network Based Dependency Parsing Using Morphological Information for Hindi. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_26

DOI: https://doi.org/10.1007/978-3-319-75477-2_26
Published: 21 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science, Computer Science (R0)
