Abstract
Text summarizers automatically construct summaries of a natural-language document. This paper examines the use of text summarization within data mining, identifying the potential summarizers have for uncovering interesting and unexpected information. It describes the current state of the art in commercial summarization and current approaches to the evaluation of summarizers. The paper then proposes a new model for text summarization and suggests a new form of evaluation. It argues that for summaries to be truly useful within data mining, they must include concepts abstracted from the text in addition to sentences extracted from the text. The paper uses two news articles to illustrate its points.
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Crangle, C.E. (2002). Text Summarization in Data Mining. In: Bustard, D., Liu, W., Sterritt, R. (eds) Soft-Ware 2002: Computing in an Imperfect World. Soft-Ware 2002. Lecture Notes in Computer Science, vol 2311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46019-5_24
DOI: https://doi.org/10.1007/3-540-46019-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43481-8
Online ISBN: 978-3-540-46019-0