A similarity measure for indefinite rankings

Published: 23 November 2010

Abstract

Ranked lists are encountered in research and daily life, and it is often of interest to compare these lists even when they are incomplete or have only some members in common. An example is document rankings returned for the same query by different search engines. A measure of the similarity between incomplete rankings should handle nonconjointness, weight high ranks more heavily than low, and be monotonic with increasing depth of evaluation; but no measure satisfying all these criteria currently exists. In this article, we propose a new measure having these qualities, namely rank-biased overlap (RBO). The RBO measure is based on a simple probabilistic user model. It provides monotonicity by calculating, at a given depth of evaluation, a base score that is nondecreasing with additional evaluation, and a maximum score that is nonincreasing. An extrapolated score can be calculated between these bounds if a point estimate is required. RBO has a parameter that determines the strength of the weighting to top ranks. We extend RBO to handle tied ranks and rankings of different lengths. Finally, we give examples of the use of the measure in comparing the results produced by public search engines and in assessing retrieval systems in the laboratory.
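
The abstract does not state the formula itself. As a rough illustration only, the sketch below assumes the commonly cited definition of RBO as a geometrically weighted average of prefix agreement, RBO = (1 - p) * sum over depths d of p^(d-1) * A_d, where A_d is the fraction of items shared by the two top-d prefixes and p is the top-weightedness parameter. The function name `rbo_bounds` is hypothetical. The lower and upper values it returns are simple bounds (the truncated sum, and the truncated sum plus the whole unseen tail weight p^k), which are looser than the tight base and maximum scores derived in the article. Ties and uneven list lengths are not handled; both lists are cut to the shorter length and are assumed to contain no duplicates.

```python
def rbo_bounds(s, t, p=0.9):
    """Return (lower, extrapolated, upper) RBO-style scores for two rankings.

    Assumptions (not taken from the abstract): each ranking is a sequence of
    hashable items with no duplicates, no ties, and both rankings are
    evaluated to the same depth k = min(len(s), len(t)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")

    k = min(len(s), len(t))
    seen_s, seen_t = set(), set()
    overlap = 0          # size of the intersection of the two depth-d prefixes
    weighted_sum = 0.0   # running sum of p^(d-1) * A_d
    agreement = 0.0      # A_d at the deepest evaluated depth

    for d in range(1, k + 1):
        x, y = s[d - 1], t[d - 1]
        if x == y:
            overlap += 1
        else:
            # Each new item counts once if the other list already ranked it.
            if x in seen_t:
                overlap += 1
            if y in seen_s:
                overlap += 1
        seen_s.add(x)
        seen_t.add(y)
        agreement = overlap / d
        weighted_sum += p ** (d - 1) * agreement

    lower = (1.0 - p) * weighted_sum   # unseen depths contribute at least 0
    tail = p ** k                      # total weight of depths beyond k
    extrapolated = lower + agreement * tail  # assume A_k persists past depth k
    upper = lower + tail               # unseen depths contribute at most 1
    return lower, extrapolated, upper


if __name__ == "__main__":
    engine_a = ["d1", "d2", "d3", "d4", "d5"]
    engine_b = ["d2", "d1", "d3", "d6", "d7"]
    for depth in range(1, 6):
        print(depth, rbo_bounds(engine_a[:depth], engine_b[:depth], p=0.9))
```

Evaluating the same pair of rankings at increasing depth shows the property the abstract describes: the lower value never decreases, the upper value never increases, and the extrapolated value stays between them. A smaller p concentrates more of the weight on the top ranks.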


Published In

ACM Transactions on Information Systems  Volume 28, Issue 4
November 2010
204 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/1852102

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 November 2010
Accepted: 01 March 2010
Revised: 01 October 2009
Received: 01 March 2009

Author Tags

  1. Rank correlation
  2. probabilistic models
  3. ranking

Qualifiers

  • Research-article
  • Research
  • Refereed
