Abstract
Machine learning has become a vital resource in modern society. It is present in everything around us, from smartwatches to self-driving cars. Training a machine learning model requires large amounts of data, which is worrisome because learned models can be discriminatory with respect to protected attributes such as race or gender. In recent years, a plethora of work has emerged on developing fair models and verifying their fairness. In this work, we propose a method based on counterfactual examples that detects bias in machine learning models. Our method works for different data types, including tabular data and images.
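The core idea behind counterfactual bias testing can be sketched as follows: flip the protected attribute of each record and check whether the model's prediction changes. This is a minimal illustration on synthetic tabular data, not the paper's actual method; the model choice, feature names, and thresholds are assumptions for the sketch.

```python
# Minimal sketch of counterfactual bias testing on synthetic tabular data.
# We flip the protected attribute ("gender", hypothetical) and measure how
# often the model's prediction changes as a result.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
gender = rng.integers(0, 2, n)                # protected attribute (0/1)
income = rng.normal(50, 10, n) + 5 * gender   # feature correlated with gender
X = np.column_stack([gender, income])
y = (income + 3 * gender > 55).astype(int)    # deliberately biased labels

model = LogisticRegression().fit(X, y)

# Counterfactual examples: identical records with the protected attribute flipped
X_cf = X.copy()
X_cf[:, 0] = 1 - X_cf[:, 0]

# Fraction of predictions that change when only the protected attribute changes:
# a nonzero rate signals the model relies on the protected attribute.
flip_rate = np.mean(model.predict(X) != model.predict(X_cf))
print(f"Fraction of predictions changed by flipping gender: {flip_rate:.2%}")
```

Because the synthetic labels depend directly on the protected attribute, a model trained on them will typically change some predictions under the flip; on a fair model this rate should be close to zero.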
Acknowledgements
We acknowledge support from the European Commission (projects H2020-871042 “SoBigData++” and H2020-101006879 “MobiDataLab”) and from the Government of Catalonia (ICREA Acadèmia Prize to J. Domingo-Ferrer).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Haffar, R., Singh, A.K., Domingo-Ferrer, J., Jebreel, N. (2022). Measuring Fairness in Machine Learning Models via Counterfactual Examples. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2022. Lecture Notes in Computer Science(), vol 13408. Springer, Cham. https://doi.org/10.1007/978-3-031-13448-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-13447-0
Online ISBN: 978-3-031-13448-7