DOI: 10.1145/3234944.3234967
research-article

Beyond Greedy Search: Pruned Exhaustive Search for Diversified Result Ranking

Published: 10 September 2018

Abstract

Because a search query can correspond to multiple intents, search result diversification aims to return a single result list that satisfies as many users' information needs as possible. However, determining the optimal ranking is NP-hard, so several algorithms have been proposed that obtain a locally optimal ranking through greedy approximation. In this paper, we propose a pruned exhaustive method that generates better solutions than greedy search. Our approach is based on three observations: most queries have fewer than ten subtopics, most relevant results cover only a few subtopics, and most search users focus only on the top results. The proposed pruned exhaustive search algorithm based on ordered pairs (PesOP) finds the optimal solution efficiently. Experimental results on the TREC Diversity and NTCIR Intent task datasets show that PesOP achieves better diversification performance than greedy strategies. Compared with the original non-pruned exhaustive search, PesOP reduces the computational cost while maintaining optimality.
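The contrast between greedy selection and a pruned exhaustive search can be illustrated with a small sketch. This is not the PesOP algorithm, whose ordered-pair pruning is not described in the abstract; it is a generic branch-and-bound over document subsets under an assumed objective (weighted subtopic coverage, in the spirit of intent-aware selection). The names `coverage`, `greedy`, `pruned_exhaustive`, `DOCS`, and `WEIGHTS`, as well as the toy data, are hypothetical.

```python
# Toy data (hypothetical): three documents, six equally weighted subtopics.
# Each document maps subtopic id -> probability that it satisfies the subtopic.
DOCS = {
    "A": {1: 1.0, 2: 1.0, 3: 1.0},
    "B": {4: 1.0, 5: 1.0, 6: 1.0},
    "C": {1: 1.0, 2: 1.0, 4: 1.0, 5: 1.0},
}
WEIGHTS = {t: 1.0 for t in range(1, 7)}


def coverage(selected, docs, weights):
    """Assumed objective: weighted chance that each subtopic is satisfied
    by at least one selected document (monotone and submodular)."""
    total = 0.0
    for topic, w in weights.items():
        miss = 1.0
        for d in selected:
            miss *= 1.0 - docs[d].get(topic, 0.0)
        total += w * (1.0 - miss)
    return total


def greedy(docs, weights, k):
    """Repeatedly add the document with the largest marginal gain."""
    chosen, remaining = [], sorted(docs)
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda d: coverage(chosen + [d], docs, weights))
        chosen.append(best)
        remaining.remove(best)
    return chosen, coverage(chosen, docs, weights)


def pruned_exhaustive(docs, weights, k):
    """Enumerate k-subsets, cutting branches that cannot beat the incumbent."""
    names = sorted(docs)
    k = min(k, len(names))
    start_set, start_val = greedy(docs, weights, k)  # greedy result seeds the incumbent
    best = {"set": start_set, "value": start_val}

    def search(start, chosen, value):
        if len(chosen) == k:
            if value > best["value"]:
                best["set"], best["value"] = chosen, value
            return
        slots = k - len(chosen)
        rest = names[start:]
        if len(rest) < slots:
            return  # not enough documents left to fill the list
        # Submodularity: gains only shrink as the set grows, so the sum of the
        # largest `slots` single-document gains is an upper bound on what remains.
        gains = sorted(
            (coverage(chosen + [d], docs, weights) - value for d in rest),
            reverse=True,
        )
        if value + sum(gains[:slots]) <= best["value"]:
            return  # prune: this branch cannot improve on the incumbent
        for i, d in enumerate(rest, start):
            picked = chosen + [d]
            search(i + 1, picked, coverage(picked, docs, weights))

    search(0, [], 0.0)
    return best["set"], best["value"]


if __name__ == "__main__":
    print("greedy:", greedy(DOCS, WEIGHTS, 2))
    print("pruned exhaustive:", pruned_exhaustive(DOCS, WEIGHTS, 2))
```

On this toy data, greedy first takes document C (four subtopics) and can then reach only five of the six subtopics, whereas the exhaustive search finds the pair A and B that covers all six. The sketch selects a set rather than an ordered ranking; a rank-discounted metric would need a different bound.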


Cited By

  • (2022) A generic framework for efficient computation of top-k diverse results. The VLDB Journal 32:4 (737-761). DOI: 10.1007/s00778-022-00770-0. Online publication date: 28-Nov-2022


    Published In

    ICTIR '18: Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval
    September 2018
    238 pages
    ISBN: 9781450356565
    DOI: 10.1145/3234944

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. retrieval models
    2. search process
    3. web search

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Basic Research Program
    • Natural Science Foundation of China
    • NIH

    Conference

    ICTIR '18

    Acceptance Rates

    ICTIR '18 Paper Acceptance Rate 19 of 47 submissions, 40%;
    Overall Acceptance Rate 235 of 527 submissions, 45%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 3
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 27 Feb 2025
