Abstract
This paper presents an experiment comparing the effectiveness of real-text parsers of Czech that are based on two completely different approaches: stochastic parsers that output dependency trees, and a meta-grammar parser that produces a chart structure representing a packed forest of phrasal derivation trees.
We formulate the main questions and problems that accompany such an experiment, offer answers to them, and report the results of tests measured on 10 thousand Czech sentences.
This work has been partly supported by the Academy of Sciences of the Czech Republic under the projects T100300414, T100300419 and 1ET100300517, by the Ministry of Education of the Czech Republic within the Center of Basic Research LC536, and by the Czech Science Foundation under the project 201/05/2781.
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Horák, A., Holan, T., Kadlec, V., Kovář, V. (2007). Dependency and Phrasal Parsers of the Czech Language: A Comparison. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_12
DOI: https://doi.org/10.1007/978-3-540-74628-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74627-0
Online ISBN: 978-3-540-74628-7
eBook Packages: Computer Science (R0)