Correlating Natural Language Parser Performance with Statistical Measures of the Text

Zhang, Yi; Wang, Rui

doi:10.1007/978-3-642-04617-9_28


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5803)

Included in the following conference series:

Annual Conference on Artificial Intelligence


Abstract

Natural language parsing is one of the central tasks in natural language processing and is widely used in many AI fields. In this paper, we address the evaluation of parser performance, in particular how it varies across datasets. We propose three simple statistical measures to characterize datasets and evaluate their correlation with parser performance. The results clearly show that different parsers differ in how much their performance varies and in how sensitive they are to these measures. The method can be used to guide the choice of natural language parsers for new application domains, as well as the systematic combination of parsers for better parsing accuracy.
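The general idea of correlating a dataset-level measure with parser accuracy can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the three measures, so `average_sentence_length` is a hypothetical stand-in, the Pearson coefficient is one possible choice of correlation statistic, and all numbers are invented toy values, not results from the paper.

```python
import math


def average_sentence_length(sentences):
    """Mean number of tokens per sentence.

    Hypothetical dataset measure; the paper's actual measures are not
    given in the abstract.
    """
    return sum(len(tokens) for tokens in sentences) / len(sentences)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy values (invented for illustration): one measure value per dataset,
# and the accuracy of two hypothetical parsers on the same datasets.
measure = [10.0, 15.0, 20.0, 25.0]
parser_a_accuracy = [0.90, 0.85, 0.80, 0.75]  # degrades steadily with the measure
parser_b_accuracy = [0.80, 0.81, 0.79, 0.80]  # nearly flat

r_a = pearson(measure, parser_a_accuracy)
r_b = pearson(measure, parser_b_accuracy)
print(f"parser A: r = {r_a:.3f}")
print(f"parser B: r = {r_b:.3f}")
```

In this toy setting, a coefficient of larger magnitude for one parser would suggest that its accuracy is more sensitive to the measure, which is the kind of signal that could inform parser choice for a new domain.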



Author information

Authors and Affiliations

LT-Lab, German Research Center for Artificial Intelligence and Computational Linguistics, Saarland University, Germany
Yi Zhang
Computational Linguistics, Saarland University, Germany
Rui Wang


Editor information

Editors and Affiliations

GET Lab, University of Paderborn, Pohlweg 47-49, 33098, Paderborn, Germany
Bärbel Mertsching, Marcus Hund & Zaheer Aziz


About this paper

Cite this paper

Zhang, Y., Wang, R. (2009). Correlating Natural Language Parser Performance with Statistical Measures of the Text. In: Mertsching, B., Hund, M., Aziz, Z. (eds) KI 2009: Advances in Artificial Intelligence. KI 2009. Lecture Notes in Computer Science, vol 5803. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04617-9_28


DOI: https://doi.org/10.1007/978-3-642-04617-9_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04616-2
Online ISBN: 978-3-642-04617-9
eBook Packages: Computer Science; Computer Science (R0)
