Abstract
Hierarchical summarization summarizes a large document based on its hierarchical structure and salient features. Previous work has shown that hierarchical summarization is a promising technique that can effectively extract the most important information from a source document. Hierarchical summarization has since been extended to the summarization of multiple documents, and three hierarchical structures have been proposed for organizing a set of related documents. This paper investigates the impact of document structure on hierarchical summarization. The results show that hierarchical summarization of multiple documents organized in a hierarchical structure outperforms other multi-document summarization systems that do not use a hierarchical structure. Moreover, hierarchical summarization by event topics extracts a set of sentences significantly different from those extracted under the other hierarchical structures, and it performs best when the summary is highly compressed.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, F.L., Yang, C.C. (2006). Impact of Document Structure on Hierarchical Summarization. In: Sugimoto, S., Hunter, J., Rauber, A., Morishima, A. (eds) Digital Libraries: Achievements, Challenges and Opportunities. ICADL 2006. Lecture Notes in Computer Science, vol 4312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11931584_49
DOI: https://doi.org/10.1007/11931584_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49375-4
Online ISBN: 978-3-540-49377-8
eBook Packages: Computer Science (R0)