DOI: 10.1145/1963405.1963441

Search result diversity for informational queries

Published: 28 March 2011

Abstract

Ambiguous queries constitute a significant fraction of search instances and pose real challenges to web search engines. With current approaches, the top results for these queries tend to be homogeneous, making it difficult for users interested in less popular aspects to find relevant documents. While existing research in search diversification offers several solutions for introducing variety into the results, the majority of such work is predicated, implicitly or otherwise, on the assumption that a single relevant document will fulfill a user's information need, which makes these solutions inadequate for many informational queries. In this paper we present a search-diversification algorithm particularly suitable for informational queries; it explicitly models the possibility that the user may need more than one page to satisfy their need. This modeling enables our algorithm to make a well-informed tradeoff between a user's desire for multiple relevant documents, probabilistic information about an average user's interest in the subtopics of a multifaceted query, and uncertainty in classifying documents into those subtopics. We evaluate the effectiveness of our algorithm against commercial search engine results and other modern ranking strategies, demonstrating notable improvement in multiple-document scenarios.
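
The tradeoff described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not specify. It is a generic greedy heuristic written under stated assumptions: a user interested in one subtopic wants up to `needed` relevant documents, document-to-subtopic relevance events are independent, and the function name, parameter names, and toy data are all hypothetical. At each step it picks the document with the largest marginal gain in the expected number of useful hits.

```python
from typing import Dict, List


def greedy_diversify(
    topic_prob: Dict[str, float],
    doc_topic_prob: Dict[str, Dict[str, float]],
    needed: int,
    n_results: int,
) -> List[str]:
    """Greedy ranking by marginal expected hits (illustrative assumption only).

    topic_prob[t]        -- assumed probability that a user wants subtopic t
    doc_topic_prob[d][t] -- assumed probability that document d is relevant to t
    needed               -- number of relevant documents a user wants (>= 1)
    """
    if needed < 1:
        raise ValueError("needed must be at least 1")

    # dist[t][j] = probability that exactly j already-selected documents are
    # relevant to subtopic t, for j < needed. Mass at j >= needed is dropped
    # because a satisfied user gains nothing from further documents.
    dist = {t: [1.0] + [0.0] * (needed - 1) for t in topic_prob}
    remaining = set(doc_topic_prob)
    ranking: List[str] = []

    def gain(doc: str) -> float:
        # Expected extra hits = sum over subtopics of
        # P(user wants t) * P(doc relevant to t) * P(user not yet satisfied on t).
        return sum(
            topic_prob[t] * doc_topic_prob[doc].get(t, 0.0) * sum(dist[t])
            for t in topic_prob
        )

    while remaining and len(ranking) < n_results:
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        ranking.append(best)
        remaining.remove(best)
        # Update the distribution of relevant-document counts per subtopic.
        for t in topic_prob:
            p = doc_topic_prob[best].get(t, 0.0)
            old = dist[t]
            dist[t] = [
                old[j] * (1.0 - p) + (old[j - 1] * p if j > 0 else 0.0)
                for j in range(needed)
            ]
    return ranking


# Hypothetical toy data: subtopic "A" is popular, subtopic "B" is niche.
TOPIC_PROB = {"A": 0.7, "B": 0.3}
DOC_TOPIC_PROB = {
    "a1": {"A": 0.9},
    "a2": {"A": 0.8},
    "a3": {"A": 0.7},
    "b1": {"B": 0.9},
}
```

On this toy data, `needed=1` yields `["a1", "b1"]` for the top two results, because a single strong document on the popular subtopic nearly satisfies those users and the niche subtopic then offers more gain. With `needed=2` the result is `["a1", "a2"]`, since users of the popular subtopic still want a second page.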

    Published In

    WWW '11: Proceedings of the 20th international conference on World wide web
    March 2011
    840 pages
    ISBN: 9781450306324
    DOI: 10.1145/1963405

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. expected hits
    2. informational queries
    3. search diversity

    Qualifiers

    • Research-article

    Conference

    WWW '11
    WWW '11: 20th International World Wide Web Conference
    March 28 - April 1, 2011
    Hyderabad, India

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (Last 12 months)8
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 01 Mar 2025

    Cited By

    • (2022) Comprehensive Information Retrieval Using Fine-Tuned Bert Model and Topic-Assisted Query Expansion. Ambient Intelligence in Health Care, pp. 117-132. DOI: 10.1007/978-981-19-6068-0_12. Online publication date: 23-Nov-2022
    • (2021) Novelty and Diversity in Recommender Systems. Recommender Systems Handbook, pp. 603-646. DOI: 10.1007/978-1-0716-2197-4_16. Online publication date: 22-Nov-2021
    • (2021) Full coverage of a reader's interests in context‐based information filtering. Journal of the Association for Information Science and Technology, 72:8, pp. 1011-1027. DOI: 10.1002/asi.24470. Online publication date: 5-Jul-2021
    • (2020) Community-diversified influence maximization in social networks. Information Systems, 92, article 101522. DOI: 10.1016/j.is.2020.101522. Online publication date: Sep-2020
    • (2019) Inferencing underspecified natural language utterances in visual analysis. Proceedings of the 24th International Conference on Intelligent User Interfaces, pp. 40-51. DOI: 10.1145/3301275.3302270. Online publication date: 17-Mar-2019
    • (2019) Search bias quantification. Information Retrieval, 22:1-2, pp. 188-227. DOI: 10.1007/s10791-018-9341-2. Online publication date: 1-Apr-2019
    • (2018) Beyond the Bubble: Assessing the Diversity of Political Search Results. Digital Journalism, pp. 1-20. DOI: 10.1080/21670811.2018.1539626. Online publication date: 28-Nov-2018
    • (2017) Proportional rankings. Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 409-415. DOI: 10.5555/3171642.3171701. Online publication date: 19-Aug-2017
    • (2017) Quantifying Search Bias. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pp. 417-432. DOI: 10.1145/2998181.2998321. Online publication date: 25-Feb-2017
    • (2016) Per-round knapsack-constrained linear submodular bandits. Neural Computation, 28:12, pp. 2757-2789. DOI: 10.1162/NECO_a_00887. Online publication date: 1-Dec-2016
