
Abstract

The accuracy of face recognition algorithms has progressed rapidly due to the onset of deep learning and the widespread availability of training data. Though tests of face recognition algorithm performance indicate yearly performance gains, error rates for many of these systems differ based on the demographic composition of the test set. These “demographic differentials” have raised concerns with regard to the “fairness” of these systems. However, no international standard for measuring fairness in biometric systems yet exists. This paper characterizes two proposed measures of face recognition algorithm fairness (fairness measures) from scientists in the U.S. and Europe, using face recognition error rates disaggregated across race and gender from 126 distinct face recognition algorithms. We find that both methods have mathematical characteristics that make them challenging to interpret when applied to these error rates. To address this, we propose a set of interpretability criteria, termed the Functional Fairness Measure Criteria (FFMC), that outlines a set of properties desirable in a face recognition algorithm fairness measure. We further develop a new fairness measure, the Gini Aggregation Rate for Biometric Equitability (GARBE), and show how, in conjunction with Pareto optimization, this measure can be used to select among alternative algorithms based on the accuracy/fairness trade-space. Finally, to facilitate the development of fairness measures in the face recognition domain, we have open-sourced our dataset of machine-readable, demographically disaggregated error rates. We believe this is currently the largest open-source dataset of its kind.
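The sketch below (Python) illustrates the general idea the abstract describes: a Gini coefficient computed over per-group error rates, blended across false match and false non-match rates, and a Pareto filter over the accuracy/fairness trade-space. It is a minimal illustration under assumptions, not the paper's exact GARBE definition; the alpha weighting, the n/(n-1) small-sample correction, and all function names and example numbers are illustrative assumptions.

```python
def gini(values):
    """Gini coefficient of a list of per-group error rates: mean absolute
    difference over all pairs, normalized by the mean, with an n/(n-1)
    small-sample correction (assumed here for illustration)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    mad = sum(abs(a - b) for a in values for b in values) / (2 * n * n * mean)
    return (n / (n - 1)) * mad


def garbe_like(fmr_by_group, fnmr_by_group, alpha=0.5):
    """Blend the Gini coefficients of false match and false non-match rates
    across demographic groups; alpha weights the two error types."""
    return alpha * gini(list(fmr_by_group.values())) + \
        (1 - alpha) * gini(list(fnmr_by_group.values()))


def pareto_front(algorithms):
    """Keep algorithms not dominated on (overall error, fairness score):
    an algorithm is dominated if another is at least as good on both axes
    and strictly better on one (lower is better on both)."""
    front = []
    for name, err, fair in algorithms:
        dominated = any(
            (e2 <= err and f2 <= fair) and (e2 < err or f2 < fair)
            for n2, e2, f2 in algorithms if n2 != name
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    # Hypothetical per-group error rates for a single algorithm.
    fmr = {"group_a": 1e-5, "group_b": 3e-5, "group_c": 2e-5}
    fnmr = {"group_a": 0.004, "group_b": 0.006, "group_c": 0.005}
    print("fairness score:", garbe_like(fmr, fnmr))

    # (name, overall error rate, fairness score); lower is better on both axes.
    candidates = [("algA", 0.010, 0.30), ("algB", 0.012, 0.20), ("algC", 0.015, 0.35)]
    print("Pareto-efficient algorithms:", pareto_front(candidates))
```

In this toy example, algC is removed because algA has both a lower overall error rate and a lower (more equitable) fairness score; algA and algB remain because each is better than the other on one axis.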

J. J. Howard and E. J. Laird and Y. B. Sirotin—First authors contributed equally to this research. Authors listed alphabetically.



Acknowledgments

This research was funded by the Department of Homeland Security, Science and Technology Directorate (DHS S&T) under contract number W911NF-13-D-0006-0003. The views presented do not represent those of DHS, the U.S. Government, or the authors’ employers.

The dataset used in this report is available on the Maryland Test Facility’s GitHub: https://github.com/TheMdTF/mdtf-public/tree/master/datasets/nist-frvt-annex15.

Paper contributions: All authors conceived the work; Eli J. Laird and John J. Howard performed the statistical analysis and wrote the paper; Yevgeniy B. Sirotin advised on statistical analysis and edited the paper. Jerry L. Tipton (IDSLabs) and Arun R. Vemury (DHS S&T) also conceived the work.

Author information


Corresponding author

Correspondence to Eli J. Laird.



Copyright information

© 2023 Springer Nature Switzerland AG

About this paper


Cite this paper

Howard, J.J., Laird, E.J., Rubin, R.E., Sirotin, Y.B., Tipton, J.L., Vemury, A.R. (2023). Evaluating Proposed Fairness Models for Face Recognition Algorithms. In: Rousseau, JJ., Kapralos, B. (eds) Pattern Recognition, Computer Vision, and Image Processing. ICPR 2022 International Workshops and Challenges. ICPR 2022. Lecture Notes in Computer Science, vol 13643. Springer, Cham. https://doi.org/10.1007/978-3-031-37660-3_31


  • DOI: https://doi.org/10.1007/978-3-031-37660-3_31


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37659-7

  • Online ISBN: 978-3-031-37660-3

  • eBook Packages: Computer Science, Computer Science (R0)
