Abstract
To avoid deep learning models utilizing shortcuts in a training dataset, many debiasing models have been developed to encourage models learning from accurate correlations. Some research constructs robust models via adversarial training. Although this series of methods shows promising debiasing performance, we do not know precisely what spurious features have been discarded during adversarial training. To address its lack of explainability especially in scenarios with low error tolerance, we design AdvExp, which not only visualizes the underlying spurious feature behind adversarial training but also maintains good debiasing performance with the assistance of a robust optimization algorithm. We show promising performance of AdvExp on BiasCheXpert, a subsampled dataset from CheXpert, and uncover potential regions in radiographs recognized by deep neural networks as gender or race-related features.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Larrazabal, A.J., Nieto, N., Peterson, V., Milone, D.H., Ferrante, E.: Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. In: Proceedings of the National Academy of Sciences, vol. 117, no. 23, pp. 12592–12594 (2020)
Gichoya, J.W., et al.: Ai recognition of patient race in medical imaging: a modelling study. Lancet Digit. Health 4(6), e406–e414 (2022)
Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation. In: International Conference on Machine Learning, pp. 1180–1189. PMLR (2015)
Zhang, B.H., Lemoine, B., Mitchell, M.: Mitigating unwanted biases with adversarial learning. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 335–340 (2018)
Kim, B., Kim, H., Kim, K., Kim, S., Kim, J.: Learning not to learn: training deep neural networks with biased data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9012–9020 (2019)
Du, S., Hers, B., Bayasi, N., Hamarneh, G., Garbi, R.: FairDisCo: fairer AI in dermatology via disentanglement contrastive learning. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds.) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol. 13804, pp. 185–202. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-25069-9_13
Bahng, H., Chun, S., Yun, S., Choo, J., Oh, S.J.: Learning de-biased representations with biased representations. In: International Conference on Machine Learning, pp. 528–539. PMLR (2020)
Wang, Z., et al.: Fairness-aware adversarial perturbation towards bias mitigation for deployed deep models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10379–10388 (2022)
Kehrenberg, T., Bartlett, M., Thomas, O., Quadrianto, N.: Null-sampling for interpretable and fair representations. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12371, pp. 565–580. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58574-7_34
Singla, S., Feizi, S.: Salient ImageNet: how to discover spurious features in deep learning? arXiv preprint arXiv:2110.04301 (2021)
Wang, T., Zhao, J., Yatskar, M., Chang, K.-W., Ordonez, V.: Balanced datasets are not enough: estimating and mitigating gender bias in deep image representations. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5310–5319 (2019)
Sagawa, S., Koh, P.W., Hashimoto, T.B., Liang, P.: Distributionally robust neural networks for group shifts: on the importance of regularization for worst-case generalization. arXiv preprint arXiv:1911.08731 (2019)
Kamiran, F., Calders, T.: Data preprocessing techniques for classification without discrimination. Knowl. Inf. Syst. 33(1), 1–33 (2012)
Irvin, J., et al.: CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 590–597 (2019)
Du, M., Yang, F., Zou, N., Hu, X.: Fairness in deep learning: a computational perspective. IEEE Intell. Syst. 36(4), 25–34 (2020)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, CY., Ching, P., Huang, PH., Hu, MC. (2024). Where Are Biases? Adversarial Debiasing with Spurious Feature Visualization. In: Rudinac, S., et al. MultiMedia Modeling. MMM 2024. Lecture Notes in Computer Science, vol 14554. Springer, Cham. https://doi.org/10.1007/978-3-031-53305-1_1
Download citation
DOI: https://doi.org/10.1007/978-3-031-53305-1_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53304-4
Online ISBN: 978-3-031-53305-1
eBook Packages: Computer ScienceComputer Science (R0)