DOI: 10.1145/1277741.1277930
Article

Random walk term weighting for information retrieval

Published: 23 July 2007

ABSTRACT

We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We use the random walk graph-based ranking algorithm on a graph that encodes terms and co-occurrence dependencies in text, from which we derive term weights that represent a quantification of how a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to the traditional tf-idf term weighting, while they outperform it when the distance between co-occurring terms is between 6 and 30 terms.
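The abstract does not spell out the scoring formula, so the following is only a minimal sketch of the general idea, assuming a PageRank/TextRank-style update on an undirected, unweighted co-occurrence graph. The function names, the damping factor of 0.85, the fixed iteration count, and the way the co-occurrence window is counted are all illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=6):
    """Link distinct terms that occur fewer than `window` positions apart.

    Assumption: edges are undirected and unweighted.
    """
    neighbours = defaultdict(set)
    for i, term in enumerate(tokens):
        # Look ahead at the next (window - 1) tokens only; the edge is
        # added in both directions, so looking back is unnecessary.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbours[term].add(other)
                neighbours[other].add(term)
    return neighbours


def random_walk_weights(tokens, window=6, damping=0.85, iterations=50):
    """Score each term with a PageRank-style random walk over the graph.

    Assumption: damping=0.85 and a fixed number of iterations stand in
    for whatever settings and convergence test the paper actually uses.
    """
    graph = cooccurrence_graph(tokens, window)
    scores = {term: 1.0 for term in graph}
    for _ in range(iterations):
        # Each term receives an equal share of every neighbour's score.
        scores = {
            term: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in graph[term])
            for term in graph
        }
    return scores


tokens = (
    "random walk term weighting for information retrieval "
    "uses term co-occurrence"
).split()
weights = random_walk_weights(tokens, window=3)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term:15s} {weight:.3f}")
```

In this toy input, a term that co-occurs with many different terms ends up with a higher weight than one that appears in a single narrow context. Such a weight could then replace the raw term frequency in a tf-idf-style ranking function, which is the comparison the abstract reports. Widening `window` adds edges between more distant terms, which corresponds to the co-occurrence distance varied in the evaluation.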

References

  1. G. Erkan and D. Radev. LexRank: Graph-based Lexical Centrality as Salience in Text Summarization. In Journal of Artificial Intelligence Research. 22, 457--479, 2004.
  2. S. Hassan and C. Banea. Random-Walk Term Weighting for Improved Text Classification. In Proceedings of TextGraphs: 2nd Workshop on Graph Based Methods for Natural Language Processing. ACL, 53--60, 2006.
  3. R. Mihalcea and P. Tarau. TextRank: Bringing Order into Texts. In Proceedings of Empirical Methods in Natural Language Processing. ACL, 404--411, 2006.
  4. L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical report, Stanford Digital Library Technologies Project, 1998.

Published in

SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
July 2007
946 pages
ISBN: 9781595935977
DOI: 10.1145/1277741

Copyright © 2007 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
