Abstract
We propose a general stochastic approach that defines relevance as a set of binomial random variables, where the expectation p of each variable indicates the amount of relevance for each relevance grade. This is a first step towards modelling evaluation measures as transformations of random variables, turning them into random evaluation measures. We show that this new approach removes the distinction between binary and multi-graded measures and, at the same time, handles incomplete information, providing a single unified framework for all these aspects. We experiment on TREC collections to show how these new random measures correlate with existing ones and which desirable properties they have, such as robustness to pool downsampling and discriminative power.
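The core idea can be illustrated with a small Monte Carlo sketch. This is not the paper's actual formulation, only a minimal illustration under assumed choices: each ranked document's relevance is drawn as a Bernoulli (binomial with n = 1) variable whose expectation p encodes its relevance grade mapped to [0, 1], and precision@k becomes a random measure whose distribution we can sample. The grade-to-p mapping and the function name `random_precision_at_k` are hypothetical.

```python
import numpy as np

# Hypothetical sketch: relevance of each ranked document modelled as a
# Bernoulli (binomial n=1) random variable; its expectation p encodes
# the relevance grade (assumed mapping, e.g. highly relevant -> 0.9,
# partially relevant -> 0.5, not relevant -> 0.0).
rng = np.random.default_rng(42)

# Expectations p for the top-5 documents retrieved by one run.
p = np.array([0.9, 0.5, 0.0, 0.5, 0.9])

def random_precision_at_k(p, k, n_samples=10_000, rng=rng):
    """Monte Carlo samples of precision@k when each relevance label is
    drawn independently as Bernoulli(p_i)."""
    draws = rng.binomial(1, p[:k], size=(n_samples, k))
    return draws.mean(axis=1)  # one precision@k value per sample

samples = random_precision_at_k(p, k=5)
# By linearity of expectation, the mean of this random precision@k
# equals mean(p[:k]); the samples also expose its variance.
print(samples.mean(), samples.std())
```

Because the measure is now a random variable, one can compare systems by whole distributions (or confidence intervals) rather than single point values, which is what makes properties like discriminative power assessable in this framework.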
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Ferrante, M., Ferro, N., Pontarollo, S. (2018). Modelling Randomness in Relevance Judgments and Evaluation Measures. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds) Advances in Information Retrieval. ECIR 2018. Lecture Notes in Computer Science(), vol 10772. Springer, Cham. https://doi.org/10.1007/978-3-319-76941-7_15
Print ISBN: 978-3-319-76940-0
Online ISBN: 978-3-319-76941-7