Abstract
Syntactic analysis of natural languages is a fundamental requirement of many applied tasks. We propose a new module, placed between morphological and syntactic analysis, that aims to determine the overall structure of a sentence before its complete analysis.
We exploit the concept of segments: linguistically motivated units that are easy to detect automatically. The output of the module, the so-called ‘segmentation chart’, describes the relationships among segments, in particular coordination, apposition, and subordination.
In this text we present a framework that enables us to develop and test rules for the automatic identification of segmentation charts. We describe two basic experiments – one with segmentation patterns obtained from the Prague Dependency Treebank and one with segmentation rules applied to plain text. We also discuss evaluation measures suitable for this task.
This paper presents results of a project supported by GAČR grant No. 405/08/0681 and partially by IS program No. 1ET100300517. The research is carried out within MŠMT project No. MSM0021620838.
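The idea of segments and a segmentation chart can be illustrated with a small sketch. The abstract does not specify how segments are detected or how relations among them are computed, so everything below is an assumption made for illustration: the boundary set (punctuation marks and a few coordinating conjunctions), the list of subordinating words, the rule that a segment opened by such a word is embedded one level deeper, and the function names themselves. English is used only for readability; this is not the authors' rule set.

```python
import re

# Assumption: segment boundaries are punctuation marks and a few
# coordinating conjunctions (hypothetical toy list, not from the paper).
BOUNDARIES = {",", ";", ":", ".", "!", "?", "and", "or", "but"}

# Assumption: a segment starting with one of these words is subordinate.
SUBORDINATORS = {"who", "which", "that", "because", "if", "when"}


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def split_into_segments(tokens):
    """Cut the token list at every boundary token; boundaries are dropped."""
    segments, current = [], []
    for tok in tokens:
        if tok.lower() in BOUNDARIES:
            if current:
                segments.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        segments.append(current)
    return segments


def toy_chart(segments):
    """Assign an embedding level per segment: 1 if it opens with a
    subordinating word, otherwise 0 (a deliberately crude heuristic)."""
    return [1 if seg[0].lower() in SUBORDINATORS else 0 for seg in segments]


sentence = "The man, who arrived late, apologized and left."
segments = split_into_segments(tokenize(sentence))
chart = toy_chart(segments)
for seg, level in zip(segments, chart):
    print(level, " ".join(seg))
```

In this toy representation the chart is simply one level per segment. A realistic module would need rules that handle nested subordination and distinguish coordination from apposition, which this sketch does not attempt.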
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lopatková, M., Holan, T. (2009). Segmentation Charts for Czech – Relations among Segments in Complex Sentences. In: Dediu, A.H., Ionescu, A.M., Martín-Vide, C. (eds) Language and Automata Theory and Applications. LATA 2009. Lecture Notes in Computer Science, vol 5457. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00982-2_46
DOI: https://doi.org/10.1007/978-3-642-00982-2_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00981-5
Online ISBN: 978-3-642-00982-2
eBook Packages: Computer Science, Computer Science (R0)