Abstract
Deep Neural Networks (DNNs) have been deployed in safety-critical real-world applications, including automated decision-making systems. Two aspects of these systems often raise concern: the fairness of their predictions and their robustness against adversarial attacks. In recent years, extensive studies have addressed these issues independently, through adversarial training and unfairness-mitigation techniques. To consider fairness and robustness simultaneously, the notion of robustness bias has been introduced: an attacker can target some sub-partitions of the dataset more easily than others. However, there has been no unified mathematical definition for measuring fairness in the robustness of DNNs that is independent of the type of adversarial attack. In this paper, we first provide a unified, precise, and mathematical theory and measurement for fairness in robustness, independent of adversarial attacks, for a DNN model. We then propose a fair adversarial retraining method (FARMUR) that mitigates unfairness in robustness by retraining DNN models based on vulnerable and robust sub-partitions. In particular, FARMUR leverages different objective functions for the vulnerable and robust sub-partitions during retraining. Experimental results demonstrate the effectiveness of FARMUR in mitigating unfairness in robustness during adversarial training without significantly degrading robustness. FARMUR improves fairness in robustness by \(19.18\%\) with only a \(2.22\%\) reduction in robustness compared with adversarial training on the UTKFace dataset partitioned by race attributes.
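The core retraining idea described above, partitioning samples into vulnerable and robust groups and applying a different objective to each, can be illustrated with a minimal toy sketch. This is not the authors' implementation: it substitutes logistic regression for a DNN, uses a single FGSM-style perturbation as the attack, and defines "vulnerable" simply as "prediction flips under perturbation"; all names and thresholds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary classification data: two Gaussian blobs.
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

def grad_w(w, X, y):
    # Gradient of the mean logistic loss w.r.t. the weights.
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

def fgsm(w, X, y, eps):
    # FGSM-style perturbation: for logistic regression, the gradient of
    # the loss w.r.t. the input is (p - y) * w.
    p = sigmoid(X @ w)
    g = (p - y)[:, None] * w[None, :]
    return X + eps * np.sign(g)

# Stage 1: standard training.
w = np.zeros(2)
for _ in range(200):
    w -= 0.5 * grad_w(w, X, y)

# Stage 2: partition the data. A sample is "vulnerable" if the
# perturbation flips its prediction, and "robust" otherwise.
eps = 0.5
clean_pred = sigmoid(X @ w) > 0.5
adv_pred = sigmoid(fgsm(w, X, y, eps) @ w) > 0.5
vuln = clean_pred != adv_pred

# Stage 3: retraining with per-partition objectives -- an adversarial
# loss (perturbed inputs) for vulnerable samples, a clean loss for
# robust ones.
for _ in range(200):
    X_mix = np.where(vuln[:, None], fgsm(w, X, y, eps), X)
    w -= 0.5 * grad_w(w, X_mix, y)

# Accuracy under attack after retraining.
robust_acc = np.mean((sigmoid(fgsm(w, X, y, eps) @ w) > 0.5) == y.astype(bool))
```

FARMUR's actual method operates on DNNs and measures unfairness across dataset sub-partitions (e.g. race attributes in UTKFace) rather than per-sample flips, but the two-objective retraining loop above captures the shape of the approach.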
Acknowledgement
This work was supported in part by the European Union through the European Social Fund within the "Information and Communication Technologies (ICT) program", and by the Swedish Innovation Agency VINNOVA projects "AutoDeep" and "SafeDeep". The computations were enabled by the supercomputing resource Berzelius, provided by the National Supercomputer Centre at Linköping University and the Knut and Alice Wallenberg Foundation.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Ali Mousavi, S., Mousavi, H., Daneshtalab, M. (2023). FARMUR: Fair Adversarial Retraining to Mitigate Unfairness in Robustness. In: Abelló, A., Vassiliadis, P., Romero, O., Wrembel, R. (eds) Advances in Databases and Information Systems. ADBIS 2023. Lecture Notes in Computer Science, vol 13985. Springer, Cham. https://doi.org/10.1007/978-3-031-42914-9_10
Print ISBN: 978-3-031-42913-2
Online ISBN: 978-3-031-42914-9