Shape Prior is Not All You Need: Discovering Balance Between Texture and Shape Bias in CNN

Chung, Hyunhee; Park, Kyung Ho

doi:10.1007/978-3-031-26284-5_30

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13842))

Included in the following conference series:

Asian Conference on Computer Vision

291 Accesses
1 Citations

Abstract

As Convolutional Neural Network (CNN) trained under ImageNet is known to be biased in image texture rather than object shapes, recent works proposed that elevating shape awareness of the CNNs makes them similar to human visual recognition. However, beyond the ImageNet-trained CNN, how can we make CNNs similar to human vision in the wild? In this paper, we present a series of analyses to answer this question. First, we propose AdaBA, a novel method of quantitatively illustrating CNN’s shape and texture bias by resolving several limits of the prior method. With the proposed AdaBA, we focused on fine-tuned CNN’s bias landscape which previous studies have not dealt with. We discover that fine-tuned CNNs are also biased to texture, but their bias strengths differ along with the downstream dataset; thus, we presume a data distribution is a root cause of texture bias exists. To tackle this root cause, we propose a granular labeling scheme, a simple but effective solution that redesigns the label space to pursue a balance between texture and shape biases. We empirically examine that the proposed scheme escalates CNN’s classification and OOD detection performance. We expect key findings and proposed methods in the study to elevate understanding of the CNN and yield an effective solution to mitigate this texture bias.

H. Chung and K. H. Park—Contributed equally to this work.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., Kim, B.: Sanity checks for saliency maps. In: Advances in Neural Information Processing Systems, vol. 31 (2018)
Google Scholar
Cheng, B., Wei, Y., Shi, H., Feris, R., Xiong, J., Huang, T.: Revisiting RCNN: on awakening the classification power of faster RCNN. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 473–490. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01267-0_28
Chapter Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Google Scholar
Dodge, S., Karam, L.: A study and comparison of human and deep learning recognition performance under visual distortions. In: 2017 26th International Conference on Computer Communication and Networks (ICCCN), pp. 1–7. IEEE (2017)
Google Scholar
Doshi-Velez, F., Kim, B.: Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017)
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
Google Scholar
Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., Brendel, W.: ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. arXiv preprint arXiv:1811.12231 (2018)
Ghorbani, A., Wexler, J., Zou, J.Y., Kim, B.: Towards automatic concept-based explanations. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Google Scholar
Gu, J., et al.: Recent advances in convolutional neural networks. Pattern Recogn. 77, 354–377 (2018)
Article Google Scholar
Guo, C., Wu, D.: Canonical correlation analysis (CCA) based multi-view learning: an overview. arXiv preprint arXiv:1907.01693 (2019)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136 (2016)
Hendrycks, D., Mazeika, M., Dietterich, T.: Deep anomaly detection with outlier exposure. arXiv preprint arXiv:1812.04606 (2018)
Hermann, K., Chen, T., Kornblith, S.: The origins and prevalence of texture bias in convolutional neural networks. Adv. Neural. Inf. Process. Syst. 33, 19000–19015 (2020)
Google Scholar
Islam, M.A., et al.: Shape or texture: understanding discriminative features in CNNs. arXiv preprint arXiv:2101.11604 (2021)
Kim, B., Wattenberg, M., Gilmer, J., Cai, C., Wexler, J., Viegas, F., et al.: Interpretability beyond feature attribution: quantitative testing with concept activation vectors (TCAV). In: International Conference on Machine Learning, pp. 2668–2677. PMLR (2018)
Google Scholar
Kornblith, S., Norouzi, M., Lee, H., Hinton, G.: Similarity of neural network representations revisited. In: International Conference on Machine Learning, pp. 3519–3529. PMLR (2019)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, vol. 25 (2012)
Google Scholar
Lee, K., Lee, H., Lee, K., Shin, J.: Training confidence-calibrated classifiers for detecting out-of-distribution samples. arXiv preprint arXiv:1711.09325 (2017)
Li, J., et al.: A systematic collection of medical image datasets for deep learning. arXiv preprint arXiv:2106.12864 (2021)
Papageorgiou, C., Poggio, T.: A trainable system for object detection. Int. J. Comput. Vision 38(1), 15–33 (2000)
Article MATH Google Scholar
Shen, D., Wu, G., Suk, H.I.: Deep learning in medical image analysis. Annu. Rev. Biomed. Eng. 19, 221–248 (2017)
Article Google Scholar
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sun, Q.S., Zeng, S.G., Liu, Y., Heng, P.A., Xia, D.S.: A new method of feature fusion and its application in image recognition. Pattern Recogn. 38(12), 2437–2448 (2005)
Article Google Scholar
Tuli, S., Dasgupta, I., Grant, E., Griffiths, T.L.: Are convolutional neural networks or transformers more like human vision? arXiv preprint arXiv:2105.07197 (2021)
Vaze, S., Han, K., Vedaldi, A., Zisserman, A.: Open-set recognition: a good closed-set classifier is all you need. arXiv preprint arXiv:2110.06207 (2021)
Wu, Z., Shen, C., Van Den Hengel, A.: Wider or deeper: revisiting the ResNet model for visual recognition. Pattern Recogn. 90, 119–133 (2019)
Article Google Scholar

Download references

Acknowledgement

This work is supported by the Korea Agency for Infrastructure Technology Advancement grant funded by the Ministry of Land, Infrastructure, and Transport (Grant RS-2022-0014579).

Author information

Authors and Affiliations

SOCAR AI Research, Seoul, Korea
Hyunhee Chung & Kyung Ho Park

Authors

Hyunhee Chung
View author publications
You can also search for this author in PubMed Google Scholar
Kyung Ho Park
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kyung Ho Park .

Editor information

Editors and Affiliations

University of Wollongong, Wollongong, NSW, Australia
Lei Wang
University of Bonn, Bonn, Germany
Juergen Gall
University of Adelaide, Adelaide, SA, Australia
Tat-Jun Chin
National Institute of Informatics, Tokyo, Japan
Imari Sato
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 3867 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Chung, H., Park, K.H. (2023). Shape Prior is Not All You Need: Discovering Balance Between Texture and Shape Bias in CNN. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13842. Springer, Cham. https://doi.org/10.1007/978-3-031-26284-5_30

Download citation

DOI: https://doi.org/10.1007/978-3-031-26284-5_30
Published: 23 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26283-8
Online ISBN: 978-3-031-26284-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Shape Prior is Not All You Need: Discovering Balance Between Texture and Shape Bias in CNN