Abstract
Natural language models and systems have been shown to reflect gender bias present in their training data. This bias can affect the downstream tasks that machine learning models built on such data are intended to accomplish. A variety of techniques have been proposed to mitigate gender bias in training data. In this paper we compare different gender bias mitigation approaches on a classification task. We consider mitigation techniques that manipulate the training data itself, including data scrubbing, gender swapping, and counterfactual data augmentation, and we also consider using de-biased word embeddings in the representation of the training data. We evaluate how effectively each approach reduces gender bias in the training data and assess its impact on task performance. Our results show that most of the bias mitigation techniques do not adversely affect classification performance, but that the techniques vary significantly in how effectively they reduce gender bias.
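To make the data-manipulation approaches named above concrete, the following is a minimal Python sketch of gender swapping, data scrubbing, and counterfactual data augmentation applied to a toy tokenised sentence. The word-pair list, placeholder token, and function names are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch (assumptions, not the authors' exact pipeline) of three
# training-data manipulations compared in the paper. The word-pair list is
# a tiny illustrative subset; real pipelines use far larger lexicons and
# handle casing, morphology, and proper names.

GENDER_PAIRS = {
    "he": "she", "she": "he",
    "him": "her", "her": "him",   # "her" is ambiguous (her/hers vs. him/his); simplified here
    "man": "woman", "woman": "man",
}
GENDERED_WORDS = set(GENDER_PAIRS)

def gender_swap(tokens):
    """Gender swapping: replace each gendered word with its counterpart."""
    return [GENDER_PAIRS.get(t.lower(), t) for t in tokens]

def scrub(tokens, placeholder="[GENDER]"):
    """Data scrubbing: remove gender information by masking gendered words."""
    return [placeholder if t.lower() in GENDERED_WORDS else t for t in tokens]

def augment(tokens):
    """Counterfactual data augmentation: keep the original example and
    add its gender-swapped counterpart to the training data."""
    return [tokens, gender_swap(tokens)]

sentence = "she praised him for winning".split()
print(gender_swap(sentence))  # ['he', 'praised', 'her', 'for', 'winning']
print(scrub(sentence))        # ['[GENDER]', 'praised', '[GENDER]', 'for', 'winning']
print(augment(sentence))      # original sentence plus its swapped copy
```

The key contrast the sketch illustrates: scrubbing and swapping replace the original training example, whereas counterfactual data augmentation doubles the data so that both gendered variants of each example are seen during training.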
Acknowledgements
This publication has emanated from research conducted with the financial support of Science Foundation Ireland under Grant number 18/CRT/6183. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.