Abstract
Translational equivariance, one of the properties of convolutional neural networks (CNNs), reflects how consistently the input at each position influences the output. By tracking changes in properties such as translational equivariance, it is possible to judge whether a model is fitting in the right direction. A target with a controllable location is designed to verify the translational equivariance of a CNN, and the effect of the CNN's parameters on positioning errors is then investigated. Furthermore, a quantitative method called the response index (ResIndex) is proposed in this paper. Once the parameters of a CNN are determined, the distribution of the input signal response at each position in the heatmap can be obtained via simple algebraic calculations. Here we demonstrate that translational equivariance is primarily affected by the convolution boundary effect, which can be quantitatively assessed by the ResIndex. Experiments show that the ResIndex is strongly negatively correlated with the MSE, with a mean Pearson correlation coefficient of -0.9282 on CIFAR-10 and -0.7837 on COCO. For the first time, a unified quantitative evaluation index, the ResIndex, is proposed to measure the translational equivariance of CNNs. A complete mathematical derivation and a time-saving calculation method are given.
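The boundary effect mentioned in the abstract can be illustrated with a toy one-dimensional check. This is only a minimal sketch and not the paper's ResIndex: the helper names `conv1d_same` and `equivariance_error`, the use of a circular shift (`np.roll`) as the translation, and the random test signal are all assumptions made for illustration. The sketch compares convolving a shifted signal with shifting the convolved signal, once with zero padding and once with circular padding.

```python
import numpy as np

def conv1d_same(x, k, mode):
    """1D cross-correlation whose output has the same length as x.

    k must have odd length. mode is forwarded to np.pad:
    "constant" gives zero padding, "wrap" gives circular padding.
    """
    pad = len(k) // 2
    xp = np.pad(x, pad, mode=mode)
    return np.array([np.dot(xp[i:i + len(k)], k) for i in range(len(x))])

def equivariance_error(x, k, shift, mode):
    """Per-position squared difference between conv(shift(x)) and shift(conv(x))."""
    a = conv1d_same(np.roll(x, shift), k, mode)
    b = np.roll(conv1d_same(x, k, mode), shift)
    return (a - b) ** 2

# Hypothetical test signal and smoothing kernel (illustrative values only).
rng = np.random.default_rng(0)
x = rng.standard_normal(32)
k = np.array([1.0, 2.0, 1.0]) / 4.0

err_zero = equivariance_error(x, k, shift=3, mode="constant")
err_wrap = equivariance_error(x, k, shift=3, mode="wrap")

print("zero padding, positions with error:", np.nonzero(err_zero > 1e-12)[0])
print("circular padding, max error:", err_wrap.max())
```

Under these assumptions, circular padding commutes with the circular shift, so its error vanishes everywhere, while zero padding leaves errors only at positions adjacent to the frame edge and to the seam created by the shift. Interior positions are unaffected, which is consistent with the abstract's claim that the loss of equivariance is driven by the boundary.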
Author information
Contributions
Peng Yang and Lingqin Kong conceived the presented idea. Peng Yang, Ming Liu and Ge Tang developed the theory and performed the computations. Yuejin Zhao encouraged Peng Yang to investigate boundary effects and supervised the findings of this work. Ming Liu, Liquan Dong, Xuhong Chu, and Hui Mei verified the analytical methods. All authors discussed the results and contributed to the final manuscript.
Ethics declarations
Conflicts of interest
The authors declare that they have no known competing financial interests.
About this article
Cite this article
Yang, P., Kong, L., Liu, M. et al. Response index: quantitative evaluation index of translational equivariance. Appl Intell 53, 28642–28654 (2023). https://doi.org/10.1007/s10489-023-05021-5
DOI: https://doi.org/10.1007/s10489-023-05021-5