Text Summarization in Data Mining

Crangle, Colleen E.

doi:10.1007/3-540-46019-5_24

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2311)

Abstract

Text summarizers automatically construct summaries of a natural-language document. This paper examines the use of text summarization within data mining, identifying the potential summarizers have for uncovering interesting and unexpected information. It describes the current state of the art in commercial summarization and current approaches to the evaluation of summarizers. The paper then proposes a new model for text summarization and suggests a new form of evaluation. It argues that for summaries to be truly useful within data mining, they must include concepts abstracted from the text in addition to sentences extracted from the text. The paper uses two news articles to illustrate its points.

Author information

Authors and Affiliations

ConverSpeech LLC, 60 Kirby Place, Palo Alto, California 94301, USA
Colleen E. Crangle

Editor information

Editors and Affiliations

Faculty of Informatics, School of Information and Software Engineering, University of Ulster, Jordanstown Campus, Newtownabbey BT37 0QB, Northern Ireland
David Bustard, Weiru Liu & Roy Sterritt

About this paper

Cite this paper

Crangle, C.E. (2002). Text Summarization in Data Mining. In: Bustard, D., Liu, W., Sterritt, R. (eds) Soft-Ware 2002: Computing in an Imperfect World. Soft-Ware 2002. Lecture Notes in Computer Science, vol 2311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46019-5_24

DOI: https://doi.org/10.1007/3-540-46019-5_24
Published: 10 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43481-8
Online ISBN: 978-3-540-46019-0
eBook Packages: Springer Book Archive
