Abstract
Machine learning has become a vital resource in modern society. It is present in everything around us, from smartwatches to self-driving cars. Training a machine learning model requires large amounts of data, which is worrisome because learned models can be discriminatory with respect to protected attributes such as race or gender. In recent years, a plethora of work has emerged on developing fair models and verifying their fairness. In this work, we propose a method based on counterfactual examples that detects bias in machine learning models. Our method works for different data types, including tabular data and images.
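The core idea behind counterfactual bias testing can be sketched as follows: flip the protected attribute of each record and check whether the model's prediction changes. This is a minimal illustration on synthetic tabular data, not the paper's actual method; the model choice, feature names, and thresholds are assumptions for the sketch.

```python
# Minimal sketch of counterfactual bias testing on synthetic tabular data.
# We flip the protected attribute ("gender", hypothetical) and measure how
# often the model's prediction changes as a result.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
gender = rng.integers(0, 2, n)                # protected attribute (0/1)
income = rng.normal(50, 10, n) + 5 * gender   # feature correlated with gender
X = np.column_stack([gender, income])
y = (income + 3 * gender > 55).astype(int)    # deliberately biased labels

model = LogisticRegression().fit(X, y)

# Counterfactual examples: identical records with the protected attribute flipped
X_cf = X.copy()
X_cf[:, 0] = 1 - X_cf[:, 0]

# Fraction of predictions that change when only the protected attribute changes:
# a nonzero rate signals the model relies on the protected attribute.
flip_rate = np.mean(model.predict(X) != model.predict(X_cf))
print(f"Fraction of predictions changed by flipping gender: {flip_rate:.2%}")
```

Because the synthetic labels depend directly on the protected attribute, a model trained on them will typically change some predictions under the flip; on a fair model this rate should be close to zero.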
Acknowledgements
We acknowledge support from the European Commission (projects H2020-871042 “SoBigData++” and H2020-101006879 “MobiDataLab”) and from the Government of Catalonia (ICREA Acadèmia Prize to J. Domingo-Ferrer).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Haffar, R., Singh, A.K., Domingo-Ferrer, J., Jebreel, N. (2022). Measuring Fairness in Machine Learning Models via Counterfactual Examples. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2022. Lecture Notes in Computer Science(), vol 13408. Springer, Cham. https://doi.org/10.1007/978-3-031-13448-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-13447-0
Online ISBN: 978-3-031-13448-7