DOI: 10.1145/1076034.1076120
Article

Improving web search results using affinity graph

Published: 15 August 2005

ABSTRACT

In this paper, we propose a novel ranking scheme named Affinity Ranking (AR) that re-ranks search results by optimizing two metrics: (1) diversity, which indicates the variance of topics in a group of documents; and (2) information richness, which measures how well a single document covers its topic. Both metrics are calculated from a directed link graph named the Affinity Graph (AG), which models the structure of a group of documents based on the asymmetric content similarities between each pair of documents. Experimental results on Yahoo! Directory, ODP, and Newsgroup data demonstrate that the proposed ranking algorithm significantly improves search performance. Specifically, within the top 10 search results, the algorithm achieves a relative improvement of 31% in diversity and 12% in information richness.
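The abstract names the ingredients but not the formulas, so the following is only an illustrative sketch of how such a pipeline could look, not the authors' implementation. It assumes (1) documents are given as term-weight vectors, (2) the asymmetric similarity is the dot product normalized by the source document's squared length, (3) information richness is computed by a PageRank-style random walk over the graph, and (4) diversity is encouraged by a greedy penalty on documents similar to those already selected. The function names, the `damping` and `threshold` parameters, and the penalty rule are all hypothetical, and any combination with the original relevance score is omitted.

```python
def affinity_graph(vectors, threshold=0.0):
    """Directed affinity matrix from an ASSUMED asymmetric similarity:
    aff[i][j] = dot(v_i, v_j) / ||v_i||^2, so aff[i][j] != aff[j][i] in general."""
    n = len(vectors)
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        norm_sq = sum(x * x for x in vectors[i])
        if norm_sq == 0.0:
            continue
        for j in range(n):
            if i == j:
                continue
            score = sum(a * b for a, b in zip(vectors[i], vectors[j])) / norm_sq
            if score > threshold:  # keep only edges above the threshold
                aff[i][j] = score
    return aff


def _row_normalize(aff):
    """Scale each row to sum to 1; rows with no outgoing edges stay all zero."""
    rows = []
    for row in aff:
        total = sum(row)
        rows.append([w / total for w in row] if total > 0 else [0.0] * len(row))
    return rows


def information_richness(aff, damping=0.85, iterations=50):
    """PageRank-style score (an assumption): a document is 'rich' when many
    documents have high affinity toward it."""
    n = len(aff)
    rows = _row_normalize(aff)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Mass held by documents without outgoing edges is spread uniformly.
        dangling = sum(scores[i] for i in range(n) if not any(rows[i]))
        scores = [
            (1 - damping) / n
            + damping * (dangling / n + sum(scores[i] * rows[i][j] for i in range(n)))
            for j in range(n)
        ]
    return scores


def rerank(aff, richness):
    """Greedy re-ranking with a hypothetical diversity penalty: after picking a
    document, lower the score of every remaining document that points to it."""
    rows = _row_normalize(aff)
    remaining = dict(enumerate(richness))
    order = []
    while remaining:
        # Ties are broken in favor of the lower document index.
        best = max(remaining, key=lambda k: (remaining[k], -k))
        order.append(best)
        del remaining[best]
        for k in remaining:
            remaining[k] -= rows[k][best] * richness[best]
    return order


# Documents 0 and 1 cover the same topic; document 2 covers a different one.
docs = [[2, 2, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
graph = affinity_graph(docs)
print(rerank(graph, information_richness(graph)))  # [0, 2, 1]
```

In this toy input, ranking by richness alone would place the two same-topic documents first. The penalty step moves the off-topic document 2 ahead of the near-duplicate document 1, which is the kind of diversity effect the abstract describes.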


Published in

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
August 2005
708 pages
ISBN: 1595930345
DOI: 10.1145/1076034

Copyright © 2005 ACM
Publisher

Association for Computing Machinery, New York, NY, United States
Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions (20%)