Distance Correlation GAN: Fair Tabular Data Generation with Generative Adversarial Networks

  • Conference paper
Artificial Intelligence in HCI (HCII 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14050)

Abstract

With the growing impact of artificial intelligence, fairness in AI has, for good reason, received increasing attention. In this paper, we propose a generative adversarial network for fair tabular data generation. The model is a Wasserstein GAN (WGAN) whose generator enforces fairness by penalizing the distance correlation between the protected attribute and the target attribute. We compare our model with another state-of-the-art generative adversarial network for fair tabular data generation and with a preprocessing repair method on four datasets. The results show that a classifier trained on the synthetic data produced by our model is fair, and that our model outperforms the other two methods. This makes the model suitable for applications concerned with both fairness and privacy preservation.
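
As a rough illustration of the quantity being penalized, the following NumPy sketch computes the sample distance correlation of Székely et al. (2007) between two columns and adds it to a WGAN-style generator loss. It is not the authors' implementation: the function names, the penalty weight `lam`, and the exact form of the combined loss are assumptions made for this example, and a trainable version would need an automatic-differentiation framework rather than plain NumPy.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise Euclidean distance matrix of v, double-centered."""
    v = np.asarray(v, dtype=float)
    v = v.reshape(len(v), -1)  # treat 1-D input as n samples of dimension 1
    d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; 0.0 if either input is constant."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                   # squared sample distance covariance
    dvar2 = (a * a).mean() * (b * b).mean()  # product of squared distance variances
    if dvar2 <= 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / np.sqrt(dvar2)))


def penalized_generator_loss(critic_scores, protected, target, lam=1.0):
    """Hypothetical objective: WGAN generator loss plus a weighted fairness penalty.

    critic_scores: critic outputs on generated rows.
    protected, target: the generated protected and target columns.
    lam: assumed penalty weight (not taken from the paper).
    """
    wgan_term = -np.mean(critic_scores)
    return float(wgan_term + lam * distance_correlation(protected, target))
```

With this sketch, a target column that is an affine function of the protected column gives a distance correlation of 1, while independent columns give a value close to 0, so minimizing the penalty pushes the generator toward data in which the target carries little information about the protected attribute.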

Author information

Corresponding author

Correspondence to Ozlem Ozmen Garibay.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rajabi, A., Garibay, O.O. (2023). Distance Correlation GAN: Fair Tabular Data Generation with Generative Adversarial Networks. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2023. Lecture Notes in Computer Science, vol 14050. Springer, Cham. https://doi.org/10.1007/978-3-031-35891-3_26

  • DOI: https://doi.org/10.1007/978-3-031-35891-3_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-35890-6

  • Online ISBN: 978-3-031-35891-3

  • eBook Packages: Computer Science, Computer Science (R0)