DOI: 10.1145/3477495.3532018
Research Article · Public Access

Measuring Fairness in Ranked Results: An Analytical and Empirical Comparison

Published: 07 July 2022

Abstract

Information access systems, such as search and recommender systems, often use ranked lists to present results believed to be relevant to the user's information need. Evaluating these lists for their fairness, along with other traditional metrics, provides a more complete understanding of an information access system's behavior beyond accuracy or utility constructs. To measure the (un)fairness of rankings, particularly with respect to the protected group(s) of producers or providers, several metrics have been proposed in recent years. However, an empirical and comparative analysis of these metrics, showing their applicability to specific scenarios or real data along with their conceptual similarities and differences, is still lacking.
We aim to bridge the gap between the theoretical and practical application of these metrics. In this paper we describe several fair ranking metrics from the existing literature in a common notation, enabling direct comparison of their approaches and assumptions, and empirically compare them on the same experimental setup and data sets in the context of three information access tasks. We also provide a sensitivity analysis to assess the impact of the design choices and parameter settings that go into these metrics, and point to additional work needed to improve fairness measurement.
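As a concrete illustration of the kind of measurement the paper compares, the sketch below computes position-discounted exposure per provider group and a simple disparity score over a single ranking. This is not any specific metric from the paper: the 1/log2(rank+1) position discount and the size-normalized max-minus-min disparity are illustrative assumptions only, chosen because logarithmic discounting is a common modeling choice in fair ranking work.

```python
import math

def group_exposure(ranking, group_of):
    """Total position-discounted exposure received by each group.

    ranking:  list of item ids, best-ranked first.
    group_of: dict mapping item id -> group label.
    Uses a 1/log2(rank+1) position discount (an illustrative choice).
    """
    exposure = {}
    for rank, item in enumerate(ranking, start=1):
        g = group_of[item]
        exposure[g] = exposure.get(g, 0.0) + 1.0 / math.log2(rank + 1)
    return exposure

def exposure_disparity(ranking, group_of):
    """Gap between the largest and smallest per-group exposure, after
    normalizing each group's exposure by its size (0 means parity)."""
    exp = group_exposure(ranking, group_of)
    sizes = {}
    for g in group_of.values():
        sizes[g] = sizes.get(g, 0) + 1
    per_item = {g: exp.get(g, 0.0) / sizes[g] for g in sizes}
    return max(per_item.values()) - min(per_item.values())

# Example: two groups of equal size, group X occupying the top ranks.
groups = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
print(exposure_disparity(["a", "b", "c", "d"], groups))  # positive: X is advantaged
```

The metrics surveyed in the paper differ precisely in choices this sketch hard-codes: the browsing/discount model, whether exposure is compared to relevance or to group size, and whether measurement is over one ranking or a distribution of rankings.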




Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022, 3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495

Publisher

Association for Computing Machinery, New York, NY, United States

      Author Tags

      1. fair ranking
      2. fairness metrics
      3. group fairness


Conference

SIGIR '22

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
