A boosting approach to improving pseudo-relevance feedback

Published: 24 July 2011

ABSTRACT

Pseudo-relevance feedback has proven effective for improving average retrieval performance. Unfortunately, many experiments have shown that although pseudo-relevance feedback helps many queries, it also hurts many others, limiting its usefulness in real retrieval applications. An important yet difficult challenge, therefore, is to improve the overall effectiveness of pseudo-relevance feedback without sacrificing the performance of individual queries too much. In this paper, we propose a novel learning algorithm, FeedbackBoost, based on the boosting framework, which improves pseudo-relevance feedback by optimizing the combination of a set of basis feedback algorithms using a loss function defined to directly measure both robustness and effectiveness. FeedbackBoost can potentially accommodate many basis feedback methods as features in the model, making it a general optimization framework for pseudo-relevance feedback. As an application, we apply FeedbackBoost to improve pseudo feedback based on language models by combining different document weighting strategies. Experimental results demonstrate that FeedbackBoost achieves better average precision while dramatically reducing the number and magnitude of feedback failures, compared with three representative pseudo feedback methods and a standard learning-to-rank approach for pseudo feedback.
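To make the idea concrete, the sketch below illustrates one way a boosting-style combination of basis pseudo-feedback methods could be set up: each basis method produces an expanded query model, the models are interpolated with learned weights, and an exponential loss over per-query gains penalizes feedback failures (queries hurt relative to the no-feedback baseline) sharply. This is a hypothetical illustration, not the authors' FeedbackBoost implementation; the names (basis_methods, evaluate, baseline), the loss form, and the coordinate-descent weight search are all assumptions.

```python
import math
from typing import Callable, Dict, List

QueryModel = Dict[str, float]  # expanded query language model: term -> weight


def combine(models: List[QueryModel], alphas: List[float]) -> QueryModel:
    """Weighted linear interpolation of expanded query models."""
    total = sum(alphas) or 1.0  # avoid division by zero when all weights are 0
    combined: QueryModel = {}
    for model, alpha in zip(models, alphas):
        for term, w in model.items():
            combined[term] = combined.get(term, 0.0) + alpha * w / total
    return combined


def feedback_boost(
    queries: List[str],
    basis_methods: List[Callable[[str], QueryModel]],  # hypothetical basis PRF methods
    evaluate: Callable[[str, QueryModel], float],      # e.g., average precision with feedback
    baseline: Callable[[str], float],                  # average precision without feedback
    rounds: int = 10,
) -> List[float]:
    """Greedily learn combination weights that trade off effectiveness and
    robustness: exp(-gain) grows quickly when feedback hurts a query (gain < 0)."""
    alphas = [0.0] * len(basis_methods)
    # Precompute each basis method's expanded model for every training query.
    cached = [[m(q) for m in basis_methods] for q in queries]

    def loss(a: List[float]) -> float:
        return sum(
            math.exp(-(evaluate(q, combine(models, a)) - baseline(q)))
            for q, models in zip(queries, cached)
        )

    for _ in range(rounds):
        # Coordinate search: try increasing each weight by a few step sizes
        # and keep the single move that reduces the loss the most.
        best_loss, best_alphas = loss(alphas), alphas
        for i in range(len(basis_methods)):
            for step in (0.1, 0.5, 1.0):
                trial = alphas[:]
                trial[i] += step
                trial_loss = loss(trial)
                if trial_loss < best_loss:
                    best_loss, best_alphas = trial_loss, trial
        alphas = best_alphas
    return alphas
```

Under this reading, effectiveness is captured by rewarding large positive gains and robustness by the steep penalty on negative gains; the paper's actual loss is defined to measure both directly, and the exponential form here is just one plausible instantiation.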

Published in

SIGIR '11: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2011
1374 pages
ISBN: 978-1-4503-0757-4
DOI: 10.1145/2009916

Copyright © 2011 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

research-article

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
