Spotting Topics with the Singular Value Decomposition

  • Conference paper
Principles of Digital Document Processing (PODDP 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1481)

Abstract

The singular value decomposition, or SVD, has been studied in the past as a tool for detecting and understanding patterns in a collection of documents. We show how the matrices produced by the SVD calculation can be interpreted, allowing us to spot patterns of characters that indicate particular topics in a corpus. A test collection, consisting of two days of AP newswire traffic, is used as a running example.
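As a rough illustration of the idea in the abstract, the sketch below builds a small character n-gram by document matrix, computes its SVD with NumPy, and lists the n-grams that carry the most weight in each left singular vector. It is not the authors' implementation: the toy documents, the choice of 3-grams, the use of raw counts without weighting, and the helper names `char_ngrams`, `ngram_document_matrix`, and `top_ngrams` are all assumptions made for this example.

```python
from collections import Counter

import numpy as np


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lower-cased text."""
    text = " ".join(text.lower().split())  # collapse runs of whitespace
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_document_matrix(docs, n=3):
    """Build a matrix A with one row per n-gram and one column per document.

    A[i, j] is the raw count of n-gram i in document j (no weighting).
    """
    counts = [Counter(char_ngrams(d, n)) for d in docs]
    vocab = sorted(set().union(*counts))
    index = {gram: i for i, gram in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc_counts in enumerate(counts):
        for gram, k in doc_counts.items():
            A[index[gram], j] = k
    return A, vocab


def top_ngrams(U, vocab, component, k=5):
    """Return the k n-grams with the largest absolute weight in one column of U."""
    order = np.argsort(-np.abs(U[:, component]))[:k]
    return [vocab[i] for i in order]


# Hypothetical toy corpus; the paper itself uses two days of AP newswire traffic.
docs = [
    "the senate passed the budget bill",
    "the budget vote in the senate was close",
    "the team won the baseball game",
    "a late home run won the game for the team",
]

A, vocab = ngram_document_matrix(docs, n=3)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

for c in range(2):
    print(f"component {c}: singular value {s[c]:.2f}")
    print("  heaviest n-grams:", top_ngrams(U, vocab, c))
    print("  document weights:", np.round(Vt[c], 2))
```

In this reading, each column of `U` weights the n-grams and the matching row of `Vt` weights the documents, so inspecting the two together suggests which character patterns go with which group of documents. On a corpus this small the components are only suggestive, and the sign of each singular vector is arbitrary, which is why the ranking uses absolute values.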

Contact author: Charles Nicholas, Department of Computer Science and Electrical Engineering, UMBC, 1000 Hilltop Circle, Baltimore, MD 21250 USA, 410-455-2594, -3969 (fax), nicholas@cs.umbc.edu

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nicholas, C., Dahlberg, R. (1998). Spotting Topics with the Singular Value Decomposition. In: Munson, E.V., Nicholas, C., Wood, D. (eds) Principles of Digital Document Processing. PODDP 1998. Lecture Notes in Computer Science, vol 1481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49654-8_7

  • DOI: https://doi.org/10.1007/3-540-49654-8_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65086-7

  • Online ISBN: 978-3-540-49654-0

  • eBook Packages: Springer Book Archive
