Rank-Biased Precision Reloaded: Reproducibility and Generalization

Ferro, Nicola; Silvello, Gianmaria

doi:10.1007/978-3-319-16354-3_83

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9022))

Included in the following conference series:

European Conference on Information Retrieval

Abstract

In this work we reproduce the experiments presented in the paper “Rank-Biased Precision for Measurement of Retrieval Effectiveness”. That paper introduced a new effectiveness measure, Rank-Biased Precision (RBP), which has become a reference point in IR experimental evaluation.

We show that the experiments presented in the original RBP paper are repeatable, and we discuss the strengths and limitations of the approach taken by its authors. We also generalize the results by adopting four experimental collections and different analysis methodologies.

References

Braschler, M.: CLEF 2003 – Overview of Results. In: Peters, C., Gonzalo, J., Braschler, M., Kluck, M. (eds.) CLEF 2003. LNCS, vol. 3237, pp. 44–63. Springer, Heidelberg (2004)
Buckley, C., Voorhees, E.M.: Retrieval Evaluation with Incomplete Information. In: Proc. 27th Ann. Int. ACM Conference on Research and Development in IR (SIGIR 2004), pp. 25–32. ACM Press, USA (2004)
Buckley, C., Voorhees, E.M.: Retrieval System Evaluation. In: TREC. Experiment and Evaluation in Information Retrieval, pp. 53–78. MIT Press, Cambridge (2005)
Carterette, B.A.: System Effectiveness, User Models, and User Utility: A Conceptual Framework for Investigation. In: Proc. 34th Ann. Int. ACM Conference on Research and Development in IR (SIGIR 2011), pp. 903–912. ACM Press, USA (2011)
Chapelle, O., Metzler, D., Zhang, Y., Grinspan, P.: Expected Reciprocal Rank for Graded Relevance. In: Proc. 18th Int. Conference on Information and Knowledge Management (CIKM 2009), pp. 621–630. ACM Press, USA (2009)
Clarke, C.L.A., Craswell, N., Voorhees, H.: Overview of the TREC 2012 Web Track. In: The Twenty-First Text REtrieval Conference Proceedings (TREC 2012), NIST, SP 500-298, USA, pp. 1–8 (2013)
Ferro, N., Peters, C.: CLEF 2009 Ad Hoc Track Overview: TEL and Persian Tasks. In: Peters, C., Di Nunzio, G.M., Kurimo, M., Mandl, T., Mostefa, D., Peñas, A., Roda, G. (eds.) CLEF 2009. LNCS, vol. 6241, pp. 13–35. Springer, Heidelberg (2010)
Gosset, W.S.: The Probable Error of a Mean. Biometrika 6(1), 1–25 (1908)
Järvelin, K., Kekäläinen, J.: Cumulated Gain-Based Evaluation of IR Techniques. ACM Transactions on Information Systems (TOIS) 20(4), 422–446 (2002)
Kendall, M.G.: Rank correlation methods. Griffin, Oxford, England (1948)
Moffat, A., Thomas, P., Scholer, F.: Users Versus Models: What Observation Tells Us About Effectiveness Metrics. In: Proc. 22nd Int. Conference on Information and Knowledge Management (CIKM 2013), pp. 659–668. ACM Press (2013)
Moffat, A., Zobel, J.: Rank-Biased Precision for Measurement of Retrieval Effectiveness. ACM Transactions on Information Systems 27(1), 1–27 (2008)
Sakai, T., Kando, N.: On Information Retrieval Metrics Designed for Evaluation with Incomplete Relevance Assessments. Inf. Retrieval 11(5), 447–470 (2008)
Voorhees, E.: Evaluation by Highly Relevant Documents. In: Proc. 24th Ann. Int. ACM Conference on Research and Development in IR (SIGIR 2001), pp. 74–82. ACM Press, USA (2001)
Voorhees, E.M.: Overview of the TREC 2004 Robust Track. In: The 13th Text REtrieval Conference Proceedings (TREC 2004), NIST, SP 500-261, USA (2004)
Voorhees, E.M., Harman, D.K.: Overview of the Fifth Text REtrieval Conference (TREC-5). In: The 5th Text REtrieval Conference (TREC-5), NIST, SP 500-238, pp. 1–28 (1996)
Voorhees, E.M., Tice, D.M.: The TREC-8 Question Answering Track Evaluation. In: The 8th Text REtrieval Conference (TREC-8), NIST, SP 500-246, USA, pp. 83–105 (1999)
Wilcoxon, F.: Individual Comparisons by Ranking Methods. Biometrics Bulletin 1(6), 80–83 (1945)
Yilmaz, E., Aslam, J.A.: Estimating Average Precision when Judgments are Incomplete. Knowledge and Information Systems 16(2), 173–211 (2008)
Yilmaz, E., Shokouhi, M., Craswell, N., Robertson, S.: Expected Browsing Utility for Web Search Evaluation. In: Proc. 19th Int. Conference on Information and Knowledge Management (CIKM 2010), pp. 1561–1565. ACM Press, USA (2010)
Zhang, Y., Park, L., Moffat, A.: Click-based evidence for decaying weight distributions in search effectiveness metrics. Inf. Retrieval 13(1), 46–69 (2010)

Author information

Authors and Affiliations

Department of Information Engineering, University of Padua, Italy
Nicola Ferro & Gianmaria Silvello

Editor information

Editors and Affiliations

Vienna University of Technology, Institute of Software Technology and Interactive Systems, Favoritenstraße 9-11/188, 1040, Vienna, Austria
Allan Hanbury
Lumi, Semion Ltd., 111 Charterhouse Street, EC1M 6AW, London, UK
Gabriella Kazai
Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstraße 9-11/188, 1040, Vienna, Austria
Andreas Rauber
Universität Duisburg-Essen, Lotharstraße 65, 47057, Duisburg, Germany
Norbert Fuhr

About this paper

Cite this paper

Ferro, N., Silvello, G. (2015). Rank-Biased Precision Reloaded: Reproducibility and Generalization. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds) Advances in Information Retrieval. ECIR 2015. Lecture Notes in Computer Science, vol 9022. Springer, Cham. https://doi.org/10.1007/978-3-319-16354-3_83

DOI: https://doi.org/10.1007/978-3-319-16354-3_83
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16353-6
Online ISBN: 978-3-319-16354-3
eBook Packages: Computer Science, Computer Science (R0)
