Abstract
Measures for the degree of non-projectivity of dependency grammar have received attention both on the formal and on the empirical side. The empirical characterization of discontinuity in constituent treebanks annotated with crossing branches has nevertheless been neglected so far. In this paper, we present two measures for the characterization of both the discontinuity of constituent structures and the non-projectivity of dependency structures. An empirical evaluation on German data as well as an investigation of the relation between the measures and grammars extracted from treebanks shows their relevance.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maier, W., Lichte, T. (2011). Characterizing Discontinuity in Constituent Treebanks. In: de Groote, P., Egg, M., Kallmeyer, L. (eds) Formal Grammar. FG 2009. Lecture Notes in Computer Science, vol 5591. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20169-1_11
DOI: https://doi.org/10.1007/978-3-642-20169-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20168-4
Online ISBN: 978-3-642-20169-1
eBook Packages: Computer Science, Computer Science (R0)