Abstract
In Chap. 1, multi-document summarization is introduced as a potential solution to the information explosion problem. A major challenge in creating a summary from information extracted from multiple sources is deciding the order in which that information should be presented. Incorrectly ordering information selected from multiple sources can lead to misunderstandings. In this chapter, we discuss the challenges involved in ordering information selected from multiple sources and present several approaches to overcoming them. We also introduce several semi-automatic evaluation measures for empirically evaluating a sentence ordering produced by an algorithm.
Notes
- 1.
Using the frequencies of words instead of binary (0, 1) values as vector elements did not have a positive impact in our experiments. We think this is because, compared with a document, a sentence typically contains fewer words, and a word rarely appears more than once in a single sentence.
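The point made in Note 1 can be illustrated with a minimal sketch that builds both a binary and a frequency-based vector for two short sentences and compares them. The whitespace tokenizer, the helper names (`binary_vector`, `frequency_vector`, `cosine`), the example sentences, and the use of cosine similarity as the comparison are all assumptions made for illustration; the note itself only states that binary values were used as vector elements.

```python
import math
from collections import Counter


def binary_vector(sentence):
    """Sparse vector with value 1 for every distinct token (presence only)."""
    return {token: 1 for token in sentence.lower().split()}


def frequency_vector(sentence):
    """Sparse vector whose values are raw token counts."""
    return dict(Counter(sentence.lower().split()))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[token] for token, weight in u.items() if token in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical example sentences; only "the" occurs twice in the first one.
s1 = "the earthquake struck the coast at dawn"
s2 = "rescue teams reached the coast by noon"

binary_similarity = cosine(binary_vector(s1), binary_vector(s2))
frequency_similarity = cosine(frequency_vector(s1), frequency_vector(s2))

print(f"binary:    {binary_similarity:.3f}")
print(f"frequency: {frequency_similarity:.3f}")
```

In these two sentences the only token that occurs more than once is the function word "the", so the two representations differ in a single count. This is consistent with the note's observation that words seldom repeat within one sentence, which leaves frequency weighting with little extra information to contribute.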
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bollegala, D., Okazaki, N., Ishizuka, M. (2013). A Bottom-Up Approach to Sentence Ordering for Multi-Document Summarization. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds) Multi-source, Multilingual Information Extraction and Summarization. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28569-1_12
DOI: https://doi.org/10.1007/978-3-642-28569-1_12
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28568-4
Online ISBN: 978-3-642-28569-1
eBook Packages: Computer Science (R0)