
Dissimilarity Based Query Selection for Efficient Preference Based IR Evaluation

  • Conference paper
Advances in Information Retrieval (ECIR 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8416)

Included in the conference series: European Conference on Information Retrieval (ECIR)

Abstract

The evaluation of Information Retrieval (IR) systems has recently explored the use of preference judgments over two lists of search results presented side-by-side to judges. Such preference judgments have been shown to capture a richer set of relevance criteria than traditional methods that collect relevance labels for single documents. However, preference judgments over lists are expensive to obtain and less reusable, since any change to either side necessitates a new judgment. In this paper, we propose a way to measure the dissimilarity between the two sides of a side-by-side evaluation experiment and show how this measure can be used to prioritize the queries to be judged in an offline setting. Our proposed measure, referred to as Weighted Ranking Difference (WRD), takes into account both the ranking differences and the similarity of the documents across the two sides, where a document may, for example, be a URL or a query suggestion. We empirically evaluate our measure on a large-scale, real-world dataset of crowdsourced preference judgments over ranked lists of auto-completion suggestions. We show that the WRD score is indicative of the probability of a tie preference judgment and can, on average, save 25% of the judging resources.
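The page gives only this high-level description of WRD. As a hedged illustration, the Python sketch below shows one way a measure in this spirit might combine rank displacement with document similarity, and how such a score could order queries for judging. The formula, function names, and the SequenceMatcher-based similarity are assumptions made for illustration, not the paper's actual definitions.

```python
from difflib import SequenceMatcher

def doc_similarity(a, b):
    # Hypothetical similarity in [0, 1]; for short auto-completion
    # suggestions, character-level string similarity is a plausible
    # stand-in, but the paper's actual choice is not given on this page.
    return SequenceMatcher(None, a, b).ratio()

def wrd(left, right):
    # Illustrative Weighted Ranking Difference between two ranked lists.
    # Combines (a) how far a shared document moves between the two
    # rankings and (b) how dissimilar unshared documents are to their
    # rank-aligned counterparts, so two near-identical lists score low.
    k = max(len(left), len(right))
    if k == 0:
        return 0.0
    score = 0.0
    for rank, doc in enumerate(left):
        if doc in right:
            # Shared document: penalize by normalized rank displacement.
            score += abs(rank - right.index(doc)) / k
        else:
            # One-sided document: penalize by its dissimilarity to
            # whatever the other side shows at the same rank (if anything).
            counterpart = right[rank] if rank < len(right) else ""
            score += 1.0 - doc_similarity(doc, counterpart)
    return score / k  # roughly in [0, 1]

def prioritize(query_pairs):
    # Judge the most dissimilar side-by-side pairs first; near-identical
    # pairs are the likeliest to yield uninformative tie judgments.
    # query_pairs: iterable of (query, left_list, right_list) triples.
    return sorted(query_pairs, key=lambda q: wrd(q[1], q[2]), reverse=True)
```

Under this reading, query pairs whose two sides are near-duplicates receive low scores and can be deferred, which is consistent with the abstract's observation that low-dissimilarity pairs tend to end in tie judgments.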





Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kazai, G., Sung, H. (2014). Dissimilarity Based Query Selection for Efficient Preference Based IR Evaluation. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_15


  • DOI: https://doi.org/10.1007/978-3-319-06028-6_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06027-9

  • Online ISBN: 978-3-319-06028-6

  • eBook Packages: Computer Science, Computer Science (R0)
