Abstract
Existing discourse parsing systems build discourse trees on the basis of different theories. Many of them use Precision, Recall and F-measure to compare discourse tree structures. These measures can be used only on topologically identical structures. However, there are known cases in which two different tree structures of the same text express the same discourse interpretation, or a very similar one. In these cases Precision, Recall and F-measure are not conclusive. In this paper, we propose three new scores for comparing discourse trees, each taking more constraints into consideration than the previous one. As basic elements for building the discourse structure we use those shared by two discourse theories, Rhetorical Structure Theory (RST) and Veins Theory, both of which use binary trees augmented with nuclearity notation. We ignore the second notation used in RST, the names of relations. The first score takes into account the coverage of inner nodes. The second score complements the first with the nuclearity of the relation. The third score computes Precision, Recall and F-measure on the vein expressions of the elementary discourse units. We show that these measures yield comparable scores where differences in structure are not accompanied by differences in interpretation.
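To make the starting point concrete, the following is a minimal sketch of the conventional span-based Precision, Recall and F-measure comparison between two binary discourse trees. It is not the scoring proposed in the paper, whose definitions are not given in the abstract. The `Node` class, the `collect_spans` and `prf` functions, and the nuclearity labels "NS" and "SN" are all illustrative assumptions. The optional `with_nuclearity` flag only hints at the kind of extra constraint a nuclearity-aware score could add.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class Node:
    """Hypothetical binary discourse tree node (not the paper's data structure)."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    edu: Optional[int] = None          # index of the discourse unit, leaves only
    nuclearity: Optional[str] = None   # assumed labels such as "NS", "SN", "NN"


def collect_spans(node: Node, with_nuclearity: bool, out: Set[tuple]) -> Tuple[int, int]:
    """Record the (first, last) unit span covered by every inner node."""
    if node.edu is not None:           # leaf: covers a single unit
        return node.edu, node.edu
    first, _ = collect_spans(node.left, with_nuclearity, out)
    _, last = collect_spans(node.right, with_nuclearity, out)
    if with_nuclearity:
        out.add((first, last, node.nuclearity))
    else:
        out.add((first, last))
    return first, last


def prf(gold: Node, test: Node, with_nuclearity: bool = False) -> Tuple[float, float, float]:
    """Precision, Recall and F-measure over the inner-node spans of two trees."""
    gold_spans: Set[tuple] = set()
    test_spans: Set[tuple] = set()
    collect_spans(gold, with_nuclearity, gold_spans)
    collect_spans(test, with_nuclearity, test_spans)
    common = len(gold_spans & test_spans)
    precision = common / len(test_spans) if test_spans else 0.0
    recall = common / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Two different bracketings of the same three units: ((1 2) 3) versus (1 (2 3)).
gold_tree = Node(
    left=Node(left=Node(edu=1), right=Node(edu=2), nuclearity="NS"),
    right=Node(edu=3),
    nuclearity="NS",
)
test_tree = Node(
    left=Node(edu=1),
    right=Node(left=Node(edu=2), right=Node(edu=3), nuclearity="NS"),
    nuclearity="SN",
)

span_scores = prf(gold_tree, test_tree)
nuclearity_scores = prf(gold_tree, test_tree, with_nuclearity=True)
```

In this toy case the two trees share only the root span, so the span-based comparison penalises the different bracketing whether or not the two trees support the same interpretation. This is the limitation the abstract describes.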
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mitocariu, E., Anechitei, D.A., Cristea, D. (2013). Comparing Discourse Tree Structures. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2013. Lecture Notes in Computer Science, vol 7816. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37247-6_41
DOI: https://doi.org/10.1007/978-3-642-37247-6_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37246-9
Online ISBN: 978-3-642-37247-6
eBook Packages: Computer Science, Computer Science (R0)