Abstract
The evaluation of Information Retrieval (IR) systems has recently been exploring the use of preference judgments over two lists of search results, presented side-by-side to judges. Such preference judgments have been shown to capture a richer set of relevance criteria than traditional methods of collecting relevance labels per single document. However, preference judgments over lists are expensive to obtain and are less reusable as any change to either side necessitates a new judgment. In this paper, we propose a way to measure the dissimilarity between two sides in side-by-side evaluation experiments and show how this measure can be used to prioritize queries to be judged in an offline setting. Our proposed measure, referred to as Weighted Ranking Difference (WRD), takes into account both the ranking differences and the similarity of the documents across the two sides, where a document may, for example, be a URL or a query suggestion. We empirically evaluate our measure on a large-scale, real-world dataset of crowdsourced preference judgments over ranked lists of auto-completion suggestions. We show that the WRD score is indicative of the probability of tie preference judgments and can, on average, save 25% of the judging resources.
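The abstract does not give the exact formulation of WRD, so the following is a minimal illustrative sketch only: a dissimilarity score over two ranked lists that blends rank displacement with content dissimilarity, assuming a reciprocal-rank discount and a pairwise similarity function sim(a, b) in [0, 1]. The function name, the discount scheme, and the blending rule are our assumptions, not the authors' definition.

```python
# Hypothetical sketch of a weighted-ranking-difference style measure.
# NOT the authors' WRD formula: the reciprocal-rank discount and the
# use of a document-similarity function `sim` are illustrative choices.

def weighted_ranking_difference(left, right, sim):
    """Score the dissimilarity of two ranked lists in [0, 1].

    `sim(a, b)` returns a similarity in [0, 1] between two documents
    (e.g., URLs or query suggestions). Returns 0.0 for identical lists
    and approaches 1.0 for disjoint lists of dissimilar documents.
    """
    def best_match(doc, other):
        # Most similar counterpart on the other side, with its rank.
        scored = [(sim(doc, d), rank) for rank, d in enumerate(other)]
        return max(scored) if scored else (0.0, 0)

    n = max(len(left), len(right))
    total, norm = 0.0, 0.0
    for rank, doc in enumerate(left):
        weight = 1.0 / (rank + 1)  # top ranks count more (assumed discount)
        s, other_rank = best_match(doc, right)
        rank_diff = abs(rank - other_rank) / max(n - 1, 1)
        # Similar documents contribute their rank displacement;
        # dissimilar ones contribute their content mismatch.
        total += weight * ((1.0 - s) + s * rank_diff)
        norm += weight
    return total / norm if norm else 0.0


# Example with exact-match similarity: identical lists score 0.0.
if __name__ == "__main__":
    exact = lambda a, b: 1.0 if a == b else 0.0
    print(weighted_ranking_difference(["a", "b", "c"], ["a", "b", "c"], exact))
    print(weighted_ranking_difference(["a", "b", "c"], ["c", "b", "a"], exact))
```

Under the paper's stated use, a low score of this kind would flag query pairs whose sides are near-duplicates, and hence likely ties that can be deprioritized for judging; the specific thresholding is not described in the abstract.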
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kazai, G., Sung, H. (2014). Dissimilarity Based Query Selection for Efficient Preference Based IR Evaluation. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_15
DOI: https://doi.org/10.1007/978-3-319-06028-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06027-9
Online ISBN: 978-3-319-06028-6
eBook Packages: Computer Science (R0)