DOI: 10.1145/3063955.3064803
research-article

The early fusion strategy for search result diversification

Published: 12 May 2017

Abstract

A typical strategy for search result diversification is a two-stage process: first, a traditional search engine produces a ranked list of documents with relevance as the only concern; then the results are re-ranked to promote diversity. In recent years, some researchers have investigated how to use data fusion to improve search result diversity. Data fusion can be applied at either of these two stages. All previous investigations focus on fusion at the second stage, that is, fusing multiple results that are already diversified. In this paper, we investigate an alternative: fusing multiple results at the first stage and then diversifying the fused result with a re-ranking algorithm. Experiments are carried out with three groups of results submitted to the TREC web adhoc task. We find that the proposed alternative performs well: it is slightly better than second-stage fusion, and it can also be implemented more efficiently.
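
The pipeline described in the abstract (fuse relevance-only rankings first, then diversify once) can be illustrated with a minimal Python sketch. The abstract does not say which fusion method or re-ranking algorithm the paper uses, so this sketch substitutes two well-known techniques as stand-ins: reciprocal rank fusion for the first stage and maximal marginal relevance (MMR) for the re-ranking. All function names, the constant `k=60`, the trade-off `lam`, the Jaccard similarity over term sets, and the toy data are assumptions made for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse relevance-only rankings; returns {doc_id: fused score}."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def jaccard(a, b):
    """Jaccard similarity of two term sets (a crude redundancy measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(scores, doc_terms, lam=0.5, top_n=10):
    """Greedy MMR: trade fused relevance against similarity to picked docs."""
    max_score = max(scores.values())
    relevance = {d: s / max_score for d, s in scores.items()}
    selected = []
    candidates = set(scores)
    while candidates and len(selected) < top_n:
        def mmr(d):
            redundancy = max(
                (jaccard(doc_terms[d], doc_terms[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[d] - (1 - lam) * redundancy

        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected


def early_fusion_diversify(ranked_lists, doc_terms, lam=0.5, top_n=10):
    """Stage 1: fuse the undiversified runs. Stage 2: diversify once."""
    fused = reciprocal_rank_fusion(ranked_lists)
    return mmr_rerank(fused, doc_terms, lam=lam, top_n=top_n)


# Toy data (hypothetical): three relevance-only runs for one query.
runs = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d1", "d3", "d2"],
]
# d2 is a near-duplicate of d1; d3 and d4 cover other aspects.
doc_terms = {
    "d1": {"apple", "fruit"},
    "d2": {"apple", "fruit"},
    "d3": {"apple", "company"},
    "d4": {"apple", "records"},
}
result = early_fusion_diversify(runs, doc_terms)
print(result)
```

In this toy run, `d2` has the second-highest fused relevance score, but because it duplicates `d1` the re-ranking step places `d3` ahead of it. Note that the diversification step runs only once, on the fused list, whereas second-stage fusion would first diversify each component run separately; this is one plausible reading of the efficiency advantage the abstract mentions.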

    Published In

    ACM TURC '17: Proceedings of the ACM Turing 50th Celebration Conference - China
    May 2017
    371 pages
    ISBN: 9781450348737
    DOI: 10.1145/3063955

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data fusion
    2. information search
    3. search result diversification
