Distance Correlation GAN: Fair Tabular Data Generation with Generative Adversarial Networks

Rajabi, Amirarsalan; Garibay, Ozlem Ozmen

doi:10.1007/978-3-031-35891-3_26

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14050)

Included in the following conference series: International Conference on Human-Computer Interaction

Abstract

With the growing impact of artificial intelligence, fairness in AI has received increasing attention, and with good reason. In this paper, we propose a generative adversarial network for fair tabular data generation. The model is a WGAN whose generator enforces fairness by penalizing the distance correlation between the protected attribute and the target attribute. We compare our results with another state-of-the-art generative adversarial network for fair tabular data generation and with a preprocessing repair method on four datasets, and show that a classifier trained on the synthetic data our model produces is fair, outperforming the other two methods. This makes the model suitable for applications concerned with fairness and privacy preservation.
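As a rough illustration of the penalty described in the abstract, the sketch below computes the sample distance correlation (in the sense of Székely et al., 2007) between a protected attribute and a target attribute, and adds it to a standard WGAN generator term. The function names, the weight `lam`, and the exact form of the combined loss are assumptions made for illustration; the abstract does not specify the loss, and this NumPy version is not differentiable, so it is not a training implementation.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise Euclidean distance matrix of v, double-centered."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
    # Subtract row and column means, then add back the grand mean.
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation between x and y (values in [0, 1])."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                    # squared distance covariance
    dvar2 = (a * a).mean() * (b * b).mean()   # product of squared distance variances
    if dvar2 <= 0.0:
        # At least one variable is constant; define the correlation as 0.
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / np.sqrt(dvar2)))


def penalized_generator_loss(critic_scores, protected, target, lam=1.0):
    """Hypothetical generator objective: WGAN term plus a fairness penalty.

    critic_scores: critic outputs on generated rows (assumed given).
    protected, target: the corresponding columns of the generated rows.
    lam: assumed penalty weight, not a value taken from the paper.
    """
    wgan_term = -float(np.mean(critic_scores))
    return wgan_term + lam * distance_correlation(protected, target)
```

Unlike the Pearson coefficient, distance correlation also responds to nonlinear dependence: for `x = np.linspace(-1, 1, 201)` and `y = x**2`, `np.corrcoef(x, y)[0, 1]` is essentially zero while `distance_correlation(x, y)` is clearly positive, which is one plausible motivation for using it as the penalty.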

Author information

Authors and Affiliations

Department of Computer Science, University of Central Florida, Orlando, FL, 32816, USA
Amirarsalan Rajabi & Ozlem Ozmen Garibay
Department of Industrial Engineering and Management Systems, Orlando, FL, 32816, USA
Ozlem Ozmen Garibay

Corresponding author

Correspondence to Ozlem Ozmen Garibay.

Editor information

Editors and Affiliations

Siemens Corporation, Princeton, NJ, USA
Helmut Degen
Foundation for Research and Technology – Hellas (FORTH), Heraklion, Crete, Greece
Stavroula Ntoa

About this paper

Cite this paper

Rajabi, A., Garibay, O.O. (2023). Distance Correlation GAN: Fair Tabular Data Generation with Generative Adversarial Networks. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2023. Lecture Notes in Computer Science, vol 14050. Springer, Cham. https://doi.org/10.1007/978-3-031-35891-3_26

DOI: https://doi.org/10.1007/978-3-031-35891-3_26
Published: 09 July 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35890-6
Online ISBN: 978-3-031-35891-3
eBook Packages: Computer Science, Computer Science (R0)
