Abstract
This paper presents a machine learning approach to finding and classifying discourse relations between two unseen sentences. It describes the training of a classifier that determines (i) whether any discourse relation holds between two sentences and, if one does, (ii) which relation it is. The ultimate goal is to insert discourse connectives between sentences in order to improve the cohesion of summaries produced by an extractive summarization system for Portuguese.
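The two-step decision described in the abstract can be pictured as a small cascade. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, the relation labels, and the connective table are all hypothetical, and the two classifiers are passed in as plain callables because the abstract does not specify the features or the learning algorithm.

```python
from typing import Callable, Optional

# Hypothetical mapping from relation label to a Portuguese connective.
# Both the labels and the connectives are illustrative assumptions,
# not taken from the paper.
CONNECTIVES = {
    "contrast": "No entanto,",
    "cause": "Por isso,",
}

PairClassifier = Callable[[str, str], bool]
RelationClassifier = Callable[[str, str], str]


def classify_pair(
    s1: str,
    s2: str,
    has_relation: PairClassifier,
    which_relation: RelationClassifier,
) -> Optional[str]:
    """Step (i): decide whether any relation holds; step (ii): name it."""
    if not has_relation(s1, s2):
        return None  # no relation found, so step (ii) is skipped
    return which_relation(s1, s2)


def insert_connective(s1: str, s2: str, relation: Optional[str]) -> str:
    """Join two sentences, adding a connective when a known relation holds."""
    connective = CONNECTIVES.get(relation) if relation else None
    if connective is None:
        return f"{s1} {s2}"
    # Naive lowercasing of the second sentence's first letter; a real
    # system would need to leave proper nouns untouched.
    return f"{s1} {connective} {s2[:1].lower()}{s2[1:]}"


if __name__ == "__main__":
    # Stand-in classifiers; trained models would take their place.
    relation = classify_pair(
        "O preço subiu.",
        "As vendas cresceram.",
        has_relation=lambda a, b: True,
        which_relation=lambda a, b: "contrast",
    )
    print(insert_connective("O preço subiu.", "As vendas cresceram.", relation))
```

Keeping the two decisions separate means the second classifier only ever sees pairs that the first one accepted, which mirrors the "(i) then (ii)" ordering in the abstract.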
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Silveira, S.B., Branco, A. (2014). Uncovering Discourse Relations to Insert Connectives between the Sentences of an Automatic Summary. In: Przepiórkowski, A., Ogrodniczuk, M. (eds) Advances in Natural Language Processing. NLP 2014. Lecture Notes in Computer Science, vol 8686. Springer, Cham. https://doi.org/10.1007/978-3-319-10888-9_26
DOI: https://doi.org/10.1007/978-3-319-10888-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10887-2
Online ISBN: 978-3-319-10888-9
eBook Packages: Computer Science (R0)