DOI: 10.1145/2600428.2609634
research-article

Learning for search result diversification

Published: 03 July 2014

Abstract

Search result diversification has gained attention as a way to tackle the ambiguous or multi-faceted information needs of users. Most existing methods for this problem rely on a heuristic, predefined ranking function, into which only limited features can be incorporated and which requires extensive tuning for different settings. In this paper, we address search result diversification as a learning problem and introduce a novel relational learning-to-rank approach to formulate the task. Defining the ranking function and the loss function for the diversification problem is, however, challenging. We first show, from both empirical and theoretical perspectives, that diverse ranking is in general a sequential selection process. On this basis, we define the ranking function as the combination of a relevance score and a diversity score between the current document and those previously selected, and the loss function as the likelihood loss of the ground truth based on the Plackett-Luce model, which can naturally model the sequential generation of a diverse ranking list. Stochastic gradient descent is then employed to conduct the unconstrained optimization, and a diverse ranking list is predicted by a sequential selection process based on the learned ranking function. Experimental results on the public TREC datasets demonstrate the effectiveness and robustness of our approach.
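
As a rough illustration of the formulation described in the abstract, the sketch below scores each candidate as a relevance term plus a diversity term computed against the documents already selected, builds a ranking by greedy sequential selection, and evaluates a Plackett-Luce negative log-likelihood for a given ground-truth ranking. It is a minimal sketch under stated assumptions, not the authors' implementation: the linear scoring form, the element-wise minimum used to aggregate pairwise diversity features, and all names (`rel_feats`, `pairwise`, `w_rel`, `w_div`) are hypothetical choices that the abstract does not specify.

```python
import numpy as np


def score(i, selected, rel_feats, pairwise, w_rel, w_div):
    """Relevance score of doc i plus its diversity score w.r.t. `selected`.

    ASSUMPTION: both parts are linear in their features, and the pairwise
    diversity features are aggregated by an element-wise minimum over the
    already-selected documents. The abstract does not fix these choices.
    """
    relevance = float(rel_feats[i] @ w_rel)
    if not selected:
        return relevance
    # pairwise[i, selected] has shape (len(selected), n_div_features)
    aggregated = pairwise[i, selected].min(axis=0)
    return relevance + float(aggregated @ w_div)


def sequential_select(rel_feats, pairwise, w_rel, w_div):
    """Greedy prediction: repeatedly pick the highest-scoring remaining doc."""
    selected, remaining = [], list(range(rel_feats.shape[0]))
    while remaining:
        best = max(
            remaining,
            key=lambda i: score(i, selected, rel_feats, pairwise, w_rel, w_div),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def plackett_luce_nll(ranking, rel_feats, pairwise, w_rel, w_div):
    """Negative log-likelihood of `ranking` under a Plackett-Luce model.

    At each position, the probability of the chosen doc is a softmax over
    the scores of the docs not yet placed, given the docs placed so far.
    """
    selected, remaining = [], list(range(rel_feats.shape[0]))
    nll = 0.0
    for doc in ranking:
        scores = np.array(
            [score(i, selected, rel_feats, pairwise, w_rel, w_div) for i in remaining]
        )
        m = scores.max()  # log-sum-exp shift for numerical stability
        log_z = m + np.log(np.exp(scores - m).sum())
        nll -= scores[remaining.index(doc)] - log_z
        selected.append(doc)
        remaining.remove(doc)
    return nll


# Toy data (hypothetical): docs 0 and 1 are near-duplicates, doc 2 is distinct.
rel_feats = np.array([[1.0], [0.9], [0.5]])
pairwise = np.zeros((3, 3, 1))
pairwise[0, 2, 0] = pairwise[2, 0, 0] = 1.0
pairwise[1, 2, 0] = pairwise[2, 1, 0] = 1.0
w_rel, w_div = np.array([1.0]), np.array([1.0])

print(sequential_select(rel_feats, pairwise, w_rel, w_div))  # [0, 2, 1]
print(plackett_luce_nll([0, 2, 1], rel_feats, pairwise, w_rel, w_div))
```

In this toy setting the less relevant but novel document 2 is placed ahead of the near-duplicate document 1. In a learning setup, the weights would be fitted by minimizing the negative log-likelihood over training queries, for example with stochastic gradient descent as the abstract mentions; the gradient step is omitted here.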

    Published In

    SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
    July 2014
    1330 pages
    ISBN:9781450322577
    DOI:10.1145/2600428

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. diversity
    2. plackett-luce model
    3. relational learning-to-rank
    4. sequential selection

    Qualifiers

    • Research-article

    Conference

    SIGIR '14

    Acceptance Rates

    SIGIR '14 Paper Acceptance Rate 82 of 387 submissions, 21%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 30
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 02 Mar 2025

    Cited By

    • (2024) Passage-aware Search Result Diversification. ACM Transactions on Information Systems 42:5 (1-29). DOI: 10.1145/3653672. Online publication date: 13-May-2024
    • (2024) Multi-grained Document Modeling for Search Result Diversification. ACM Transactions on Information Systems 42:5 (1-22). DOI: 10.1145/3652852. Online publication date: 27-Apr-2024
    • (2024) CL4DIV: A Contrastive Learning Framework for Search Result Diversification. Proceedings of the 17th ACM International Conference on Web Search and Data Mining (171-180). DOI: 10.1145/3616855.3635851. Online publication date: 4-Mar-2024
    • (2024) Integrated Personalized and Diversified Search Based on Search Logs. IEEE Transactions on Knowledge and Data Engineering 36:2 (694-707). DOI: 10.1109/TKDE.2023.3291006. Online publication date: 1-Feb-2024
    • (2023) Personalized and Diversified: Ranking Search Results in an Integrated Way. ACM Transactions on Information Systems 42:3 (1-25). DOI: 10.1145/3631989. Online publication date: 9-Nov-2023
    • (2023) Result Diversification for Legal case Retrieval. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region (158-168). DOI: 10.1145/3624918.3625319. Online publication date: 26-Nov-2023
    • (2023) Search Result Diversification Using Query Aspects as Bottlenecks. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (3040-3051). DOI: 10.1145/3583780.3615050. Online publication date: 21-Oct-2023
    • (2023) GDESA: Greedy Diversity Encoder with Self-attention for Search Results Diversification. ACM Transactions on Information Systems 41:2 (1-36). DOI: 10.1145/3544103. Online publication date: 3-Apr-2023
    • (2023) User Behavior Simulation for Search Result Re-ranking. ACM Transactions on Information Systems 41:1 (1-35). DOI: 10.1145/3511469. Online publication date: 20-Jan-2023
    • (2023) Modeling Global-Local Subtopic Distribution with Hypergraph to Diversify Search Results. 2023 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN54540.2023.10191529. Online publication date: 18-Jun-2023
