DOI: 10.1145/383952.384038
Article

Unitary operators for fast latent semantic indexing (FLSI)

Published: 01 September 2001

ABSTRACT

Latent Semantic Indexing (LSI) dramatically reduces the dimension of the document space by mapping it into a space spanned by conceptual indices. Empirically, the number of concepts needed to represent the documents is far smaller than the great variety of words in the textual representation. Although this almost obviates the problem of lexical matching, the mapping incurs a high computational cost compared to document parsing, indexing, query matching, and updating. This paper shows how LSI is based on a unitary transformation, for which there are computationally more attractive alternatives. This is exemplified by the Haar transform, which is memory efficient and can be computed in linear to sublinear time. The principal advantages of LSI are thus preserved while the computational costs are drastically reduced.
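
To make the role of unitarity concrete, the following is a minimal sketch of an orthonormal Haar transform in pure Python. It is an illustration only and is not taken from the paper: the function names `haar_forward` and `haar_inverse` are hypothetical, the input is assumed to be a dense vector (for example, term weights of one document) whose length is a power of two, and only the linear-time variant is shown. Because each pass halves the working length, the total work is roughly 2n operations, and because the transform is orthonormal, inner products between vectors (and therefore cosine similarities) are unchanged.

```python
import math


def haar_forward(x):
    """Orthonormal Haar transform; len(x) is assumed to be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in x]
    s = 1.0 / math.sqrt(2.0)
    length = n
    # Each pass touches `length` entries, then halves it: n + n/2 + ... < 2n.
    while length > 1:
        half = length // 2
        tmp = out[:length]
        for i in range(half):
            a, b = tmp[2 * i], tmp[2 * i + 1]
            out[i] = (a + b) * s          # scaled pairwise sum (coarse part)
            out[half + i] = (a - b) * s   # scaled pairwise difference (detail)
        length = half
    return out


def haar_inverse(c):
    """Inverse of haar_forward (the transpose, since the transform is orthonormal)."""
    n = len(c)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in c]
    s = 1.0 / math.sqrt(2.0)
    length = 2
    while length <= n:
        half = length // 2
        tmp = out[:length]
        for i in range(half):
            coarse, detail = tmp[i], tmp[half + i]
            out[2 * i] = (coarse + detail) * s
            out[2 * i + 1] = (coarse - detail) * s
        length *= 2
    return out


def dot(u, v):
    """Plain inner product, used to check that the transform preserves it."""
    return sum(a * b for a, b in zip(u, v))
```

As a usage sketch, transforming two vectors `x` and `y` with `haar_forward` and comparing `dot(x, y)` against the inner product of the coefficient vectors should give the same value up to floating-point rounding. A reduced representation could then be obtained by keeping only some of the coefficients, although which coefficients to keep is a design choice this sketch does not address.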


          • Published in

            SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
            September 2001
            454 pages
            ISBN:1581133316
            DOI:10.1145/383952

            Copyright © 2001 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            SIGIR '01 Paper Acceptance Rate: 47 of 201 submissions, 23%. Overall Acceptance Rate: 792 of 3,983 submissions, 20%.
