Abstract
In addressing the black-box nature of neural networks, how to extract feature information from data and generalize its inherent features is a focus of artificial intelligence research. To address the weak generalization of deep convolutional networks under large image transformations, a new method for robust image recognition based on a feature-entropy-oriented differential fusion capsule network (DFC) is proposed, the core of which is feature entropy approximation. First, convolution feature entropy is introduced as the transformation metric at the feature extraction level, and a convolution difference scale space is constructed using a residual network to approximate similar entropy. Then, based on this scale feature, convolution features are extracted in a lower scale space and fused with the last scale feature to form a convolution differential fusion feature. Finally, a capsule network clusters the features autonomously via dynamic routing to complete the semantic learning of various high-dimensional features, thereby further enhancing recognition robustness. Experimental results show that feature entropy can effectively evaluate recognition performance on transformed images, and that the DFC is effective for robust recognition under large image transformations such as translation, rotation, and scaling.
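The abstract does not give the formula for convolution feature entropy, so the following is only a minimal sketch under an assumption: that the entropy of a feature map can be approximated by the Shannon entropy of its activation histogram. The function names `feature_entropy` and `entropy_gap` and the `bins` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def feature_entropy(feature_map, bins=32):
    """Shannon entropy (in bits) of the activation histogram of a feature map.

    Assumption: this histogram-based definition stands in for the paper's
    convolution feature entropy, which the abstract does not specify.
    """
    values = np.asarray(feature_map, dtype=np.float64).ravel()
    hist, _ = np.histogram(values, bins=bins)
    # Drop empty bins so that log2 is only evaluated on positive probabilities.
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def entropy_gap(features_original, features_transformed, bins=32):
    """Absolute entropy difference between two feature maps (hypothetical metric)."""
    return abs(
        feature_entropy(features_original, bins)
        - feature_entropy(features_transformed, bins)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.random((8, 8))
    # A circular shift only permutes activations, so the histogram is unchanged
    # and the gap is zero.
    shifted = np.roll(fmap, shift=3, axis=1)
    print(entropy_gap(fmap, shifted))
```

Under this assumption, a small gap between the features of an image and those of its transformed version would suggest that the feature extractor responds similarly to both. How the DFC actually computes and approximates the entropy is defined in the full paper, not here.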
Acknowledgments
This work was supported by the Nanjing Institute of Technology High-level Scientific Research Foundation for the Introduction of Talent (No. YKJ201918) and the Science Foundation for Young Scientists of Jiangsu (BK20181017, BK20181018), and partially supported by the National Natural Science Foundation of China (Nos. 61806175, 61902179, 51675259).
Cite this article
Qian, K., Tian, L., Liu, Y. et al. Image robust recognition based on feature-entropy-oriented differential fusion capsule network. Appl Intell 51, 1108–1117 (2021). https://doi.org/10.1007/s10489-020-01873-3
DOI: https://doi.org/10.1007/s10489-020-01873-3