DOI: 10.1145/2009916.2010091
poster

Temporal latent semantic analysis for collaboratively generated content: preliminary results

Published: 24 July 2011

ABSTRACT

Latent semantic analysis (LSA) has been studied intensively because of its wide application in Information Retrieval and Natural Language Processing. Traditional models such as LSA, however, examine only one (current) version of a document. With the recent proliferation of collaboratively generated content such as threads in online forums, Collaborative Question Answering archives, Wikipedia, and other versioned content, the document generation process is now directly observable. In this study, we explore how this additional temporal information about document evolution can be used to improve the identification of latent document topics. Specifically, we propose a novel hidden-topic modeling algorithm, temporal Latent Semantic Analysis (tLSA), which elegantly extends LSA to model document revision history using tensor decomposition. Our experiments show that tLSA outperforms LSA on word relatedness estimation using benchmark data, and we explore applications of tLSA to other tasks.
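To make the idea of extending LSA with a tensor decomposition concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not say how the tensor is laid out or which decomposition is used. The sketch assumes a terms × documents × revisions count tensor and a CP (CANDECOMP/PARAFAC) decomposition fitted by alternating least squares. All function names (`cp_als`, `reconstruct`, `term_similarity`) and the toy data are hypothetical. Word relatedness is then read off as the cosine similarity between rows of the term factor matrix, by analogy with term vectors in ordinary LSA.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    rank = a.shape[1]
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, rank)


def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the rest in C order."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def cp_als(x, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to a 3-way tensor by alternating least squares.

    Assumption: CP/ALS is one common tensor decomposition; the abstract does
    not state which decomposition tLSA actually uses.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            # Gram matrix of the Khatri-Rao product is a Hadamard product.
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors


def reconstruct(factors):
    """Rebuild the tensor from its three CP factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def term_similarity(term_factors):
    """Cosine similarity between rows of the term factor matrix."""
    norms = np.maximum(np.linalg.norm(term_factors, axis=1, keepdims=True), 1e-12)
    unit = term_factors / norms
    return unit @ unit.T


if __name__ == "__main__":
    # Hypothetical toy data: 4 terms x 3 documents x 2 revisions of raw counts.
    terms = ["tensor", "matrix", "soccer", "goal"]
    counts = np.zeros((4, 3, 2))
    counts[:, :, 0] = [[2, 0, 1], [1, 0, 1], [0, 3, 0], [0, 1, 0]]  # first revision
    counts[:, :, 1] = [[3, 0, 1], [2, 0, 2], [0, 3, 1], [0, 2, 1]]  # later revision
    term_f, doc_f, rev_f = cp_als(counts, rank=2)
    sim = term_similarity(term_f)
    for i, word in enumerate(terms):
        print(word, np.round(sim[i], 2))
```

Using only the latest revision slice (`counts[:, :, -1]`) with a truncated SVD would give the ordinary LSA baseline that the abstract compares against; the third tensor mode is what carries the revision history.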

Published in

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916

Copyright © 2011 Authors

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
