ABSTRACT
Over the past 10--15 years, offline learning to rank has had a tremendous influence on information retrieval, both scientifically and in practice. Recently, as the limitations of offline learning to rank have become apparent, the community has paid increasing attention to online learning to rank methods for information retrieval. Such methods learn from user interactions rather than from a set of labeled data that is fully available for training up front.
Below we describe why we believe the time is right for an intermediate-level tutorial on online learning to rank, the objectives of the proposed tutorial, and its relevance, as well as more practical details such as format, schedule, and support materials.
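To make the contrast with offline learning to rank concrete, the following is a minimal sketch of Dueling Bandit Gradient Descent (Yue and Joachims, 2009), a representative online learning to rank method. The hidden preference vector `w_star` and the `candidate_wins` oracle are simulation stand-ins introduced here for illustration: in a real system, the comparison outcome would come from interleaving the two rankers' result lists and observing user clicks, not from any ground truth.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two weight vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def run_dbgd(n_steps=500, dim=5, delta=1.0, step=0.1, seed=0):
    """Dueling Bandit Gradient Descent with a simulated comparison oracle."""
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=dim)   # hidden "true" user preference (unknown to the learner)
    w = rng.normal(size=dim)        # current linear ranker weights

    def candidate_wins(w_cur, w_cand):
        # Simulated interleaved comparison: the candidate wins when its
        # scoring direction is closer to the hidden preference. In practice
        # this signal is derived from clicks on an interleaved result list.
        return cosine(w_cand, w_star) > cosine(w_cur, w_star)

    for _ in range(n_steps):
        u = rng.normal(size=dim)
        u /= np.linalg.norm(u)                # uniform random unit direction
        if candidate_wins(w, w + delta * u):  # explore: try a perturbed ranker
            w = w + step * u                  # exploit: small move toward the winner
    return w, w_star
```

Starting from random weights, the ranker's scoring direction drifts toward the hidden preference using only relative, pairwise comparison outcomes. This is exactly the kind of feedback that online methods extract from user interactions, and it requires no up-front labeled training set.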
Index Terms
- Online Learning to Rank for Information Retrieval: SIGIR 2016 Tutorial