Correlating Natural Language Parser Performance with Statistical Measures of the Text

  • Conference paper
KI 2009: Advances in Artificial Intelligence (KI 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5803)

Abstract

Natural language parsing, one of the central tasks in natural language processing, is widely used in many AI fields. In this paper, we address the problem of evaluating parser performance, in particular how it varies across datasets. We propose three simple statistical measures to characterize the datasets and evaluate their correlation with parser performance. The results clearly show that different parsers differ in how much their performance varies and in how sensitive they are to these measures. The method can be used to guide the choice of natural language parsers for applications in new domains, as well as their systematic combination for better parsing accuracy.
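The kind of analysis the abstract describes, correlating a dataset-level statistic with parser accuracy, can be sketched as follows. This is a minimal illustration only, not the authors' method: the abstract does not name the three measures, so average sentence length is used here purely as an assumed stand-in, and the toy datasets and accuracy values are placeholders rather than results from the paper. The function names `avg_sentence_length` and `pearson` are hypothetical.

```python
from math import sqrt


def avg_sentence_length(sentences):
    """Mean number of whitespace-separated tokens per sentence.

    Assumed example measure; the paper's own measures are not given
    in the abstract.
    """
    if not sentences:
        raise ValueError("need at least one sentence")
    return sum(len(s.split()) for s in sentences) / len(sentences)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy datasets, invented for illustration only.
    datasets = {
        "news": ["The market fell sharply on Monday .",
                 "Analysts expect a slow recovery ."],
        "bio": ["The protein binds to the receptor and inhibits "
                "downstream signalling in vitro ."],
        "chat": ["See you soon .", "Thanks !"],
    }
    # Placeholder accuracies for one imaginary parser; NOT from the paper.
    placeholder_accuracy = {"news": 0.88, "bio": 0.81, "chat": 0.85}

    names = list(datasets)
    measure = [avg_sentence_length(datasets[name]) for name in names]
    accuracy = [placeholder_accuracy[name] for name in names]
    print("correlation:", pearson(measure, accuracy))
```

Running the same computation per parser would give one coefficient per parser and measure, which is the sort of comparison that could inform choosing a parser for a new domain.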

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Y., Wang, R. (2009). Correlating Natural Language Parser Performance with Statistical Measures of the Text. In: Mertsching, B., Hund, M., Aziz, Z. (eds) KI 2009: Advances in Artificial Intelligence. KI 2009. Lecture Notes in Computer Science (LNAI), vol 5803. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04617-9_28

  • DOI: https://doi.org/10.1007/978-3-642-04617-9_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04616-2

  • Online ISBN: 978-3-642-04617-9

  • eBook Packages: Computer Science, Computer Science (R0)
