Spotting Topics with the Singular Value Decomposition

Nicholas, Charles; Dahlberg, Randall

doi:10.1007/3-540-49654-8_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1481)

Included in the following conference series:

International Workshop on Principles of Digital Document Processing

Abstract

The singular value decomposition, or SVD, has been studied in the past as a tool for detecting and understanding patterns in a collection of documents. We show how the matrices produced by the SVD calculation can be interpreted, allowing us to spot patterns of characters that indicate particular topics in a corpus. A test collection, consisting of two days of AP newswire traffic, is used as a running example.
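
The idea of reading topics off the SVD factors can be illustrated with a minimal sketch. The abstract does not state how documents are represented, so the character n-gram counts, the n-gram length of 4, the absence of any weighting, and the toy four-document corpus below are all assumptions made for illustration, not the authors' actual setup. The function names are likewise hypothetical. The sketch builds an n-gram-by-document count matrix, factors it with NumPy, and lists the n-grams that carry the most weight in a chosen left singular vector.

```python
from collections import Counter

import numpy as np


def char_ngrams(text, n=4):
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_document_matrix(docs, n=4):
    """Return an (n-grams x documents) count matrix and its sorted n-gram vocabulary."""
    counts = [char_ngrams(d, n) for d in docs]
    vocab = sorted(set().union(*counts))
    index = {gram: i for i, gram in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc_counts in enumerate(counts):
        for gram, k in doc_counts.items():
            A[index[gram], j] = k
    return A, vocab


def top_ngrams(U, vocab, component, k=5):
    """List the k n-grams with the largest absolute weight in one left singular vector."""
    order = np.argsort(-np.abs(U[:, component]))[:k]
    return [vocab[i] for i in order]


# Toy corpus (an assumption): two documents about markets, two about weather.
docs = [
    "Stock markets rallied as investors bought shares",
    "Investors sold shares when the stock markets fell",
    "Heavy rain and storms flooded the coastal towns",
    "Storms brought heavy rain to the coastal region",
]

A, vocab = ngram_document_matrix(docs)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

for component in range(2):
    print(component, s[component], top_ngrams(U, vocab, component))
    # Each column of Vt corresponds to a document; the entries in row `component`
    # show how strongly each document loads on that singular direction.
    print("  document loadings:", np.round(Vt[component], 2))
```

In this reading, each left singular vector groups n-grams that tend to occur together, and the matching row of `Vt` shows which documents load on that group. Signs of singular vectors are arbitrary, which is why the sketch ranks n-grams by absolute weight.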

Contact author: Charles Nicholas, Department of Computer Science and Electrical Engineering, UMBC, 1000 Hilltop Circle, Baltimore, MD 21250 USA, 410-455-2594, -3969 (fax), nicholas@cs.umbc.edu

Author information

Authors and Affiliations

University of Maryland Baltimore County, Baltimore
Charles Nicholas
U.S. Department of Defense, USA
Randall Dahlberg

Editor information

Editors and Affiliations

Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, Milwaukee, WI, 53211, USA
Ethan V. Munson
Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD, 21250, USA
Charles Nicholas
Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR
Derick Wood

About this paper

Cite this paper

Nicholas, C., Dahlberg, R. (1998). Spotting Topics with the Singular Value Decomposition. In: Munson, E.V., Nicholas, C., Wood, D. (eds) Principles of Digital Document Processing. PODDP 1998. Lecture Notes in Computer Science, vol 1481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49654-8_7

DOI: https://doi.org/10.1007/3-540-49654-8_7
Published: 15 September 2000
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65086-7
Online ISBN: 978-3-540-49654-0
eBook Packages: Springer Book Archive
