Abstract
Gender bias is one of the types of bias studied in fair machine learning (ML), which seeks equity in the predictions made by ML models. Bias mitigation is often based on protecting the sensitive attribute (e.g., gender or race) by optimising fairness metrics. However, reducing the relevance of the sensitive attribute can lead to higher error rates. This paper analyses the relationship between gender bias and misclassification using explainable artificial intelligence. The proposed method applies clustering to identify groups of similar misclassified instances among both false positive and false negative predictions. The resulting prototype instances are then analysed with Break-down, a local explainer. Positive and negative feature contributions are studied for models trained with and without gender data, as well as for models trained with bias mitigation methods. The results show the potential of local explanations for understanding different forms of gender bias in misclassification, which are not always related to a high feature contribution of the gender attribute.
Funding: GENIA project funded by the Annual Research Plan of University of Córdoba (UCOImpulsa mod., 2022). Grant PID2020-115832GB-I00 funded by MICIN/AEI/10.13039/501100011033. Andalusian Regional Government (postdoctoral grant DOC_00944).
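The pipeline described in the abstract (clustering the misclassified instances to obtain prototypes, then explaining each prototype with Break-down) could be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes a scikit-learn random forest as the classifier, Affinity Propagation as the clustering method that yields exemplar (prototype) instances, and the dalex package for Break-down explanations; the function name and data handling are hypothetical.

```python
# Minimal sketch (assumed pipeline): train a classifier, collect its
# false positives and false negatives, cluster each error type with
# Affinity Propagation to obtain exemplar prototypes, and explain each
# prototype with the Break-down method from dalex.
import dalex as dx
import pandas as pd
from sklearn.cluster import AffinityPropagation
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def explain_misclassifications(X: pd.DataFrame, y: pd.Series):
    # X is assumed to contain only numeric (already encoded) features.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

    y_pred = model.predict(X_te)
    y_true = y_te.to_numpy()
    errors = {
        "false_positive": X_te[(y_pred == 1) & (y_true == 0)],
        "false_negative": X_te[(y_pred == 0) & (y_true == 1)],
    }

    explainer = dx.Explainer(model, X_tr, y_tr, verbose=False)
    prototypes = {}
    for kind, X_err in errors.items():
        if len(X_err) < 2:
            continue
        # Affinity Propagation selects exemplar rows that act as prototypes
        # of each group of similar misclassified instances.
        ap = AffinityPropagation(random_state=0).fit(X_err)
        exemplars = X_err.iloc[ap.cluster_centers_indices_]
        # Break-down attributes each prototype's prediction to its features,
        # so positive/negative contributions (e.g. of the gender attribute)
        # can be compared across error types.
        prototypes[kind] = [
            explainer.predict_parts(row.to_frame().T, type="break_down").result
            for _, row in exemplars.iterrows()
        ]
    return prototypes
```

With this kind of setup, the comparison described in the abstract amounts to running the same function on datasets with and without the gender column, or on models wrapped with a bias mitigation method, and inspecting the resulting contribution tables per prototype.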
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ramírez, A. (2025). Exploring Gender Bias in Misclassification with Clustering and Local Explanations. In: Meo, R., Silvestri, F. (eds) Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2023. Communications in Computer and Information Science, vol 2135. Springer, Cham. https://doi.org/10.1007/978-3-031-74633-8_9
DOI: https://doi.org/10.1007/978-3-031-74633-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-74632-1
Online ISBN: 978-3-031-74633-8