
Image robust recognition based on feature-entropy-oriented differential fusion capsule network

Applied Intelligence

Abstract

A central question in addressing the black-box nature of neural networks is how to extract feature information from data and generalize the data's inherent features. To address the weak generalization of deep convolutional networks under large image transformations, a new method for robust image recognition based on a feature-entropy-oriented differential fusion capsule network (DFC) is proposed, the core of which is feature entropy approximation. First, convolution feature entropy is introduced as the transformation metric at the feature extraction level, and a convolution difference scale space is constructed using a residual network to approximate similar entropy. Then, based on this scale feature, convolution features are extracted in a lower scale space and fused with the last scale feature to form a convolution differential fusion feature. Finally, a capsule network clusters these features autonomously through dynamic routing to complete the semantic learning of the various high-dimensional features, further enhancing recognition robustness. Experimental results show that feature entropy can effectively evaluate recognition performance on transformed images, and that the DFC is effective for robust recognition under large image transformations such as translation, rotation, and scaling.
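Two ingredients named in the abstract, feature entropy and capsule dynamic routing, can be illustrated with a minimal sketch. The abstract does not give the paper's definition of convolution feature entropy, so the histogram-based Shannon entropy below is only an assumed stand-in. The routing loop follows the general routing-by-agreement procedure of Sabour, Frosst and Hinton (2017) and is not the DFC architecture itself. The names `feature_entropy`, `squash`, `softmax`, and `dynamic_routing`, and the `bins` and `iterations` parameters, are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def feature_entropy(feature_map, bins=16):
    """Shannon entropy (in bits) of the histogram of activation values.

    Assumption: a histogram-based entropy stands in for the paper's
    convolution feature entropy, whose definition is not in the abstract.
    """
    values = [v for row in feature_map for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant map has a single occupied bin
    width = (hi - lo) / bins
    # Clamp the top value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def squash(s):
    """Capsule non-linearity: keeps direction, maps the norm into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """Routing-by-agreement over prediction vectors u_hat[i][j].

    i indexes lower-level capsules, j indexes output capsules, and each
    u_hat[i][j] is a list of floats. Returns one vector per output capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits
    v = []
    for _ in range(iterations):
        # Each lower capsule distributes its output across the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit where a prediction agrees with the capsule output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v
```

In this sketch, lower-level capsules whose predictions agree for an output capsule reinforce it, while conflicting predictions cancel, which is the clustering behaviour the abstract attributes to dynamic routing. Comparing `feature_entropy` of a feature map before and after an image transformation is one hypothetical way to use entropy as a transformation metric.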

Acknowledgments

This work was supported by the Nanjing Institute of Technology High-level Scientific Research Foundation for the Introduction of Talent (No. YKJ201918) and the Science Foundation for Young Scientists of Jiangsu (Nos. BK20181017, BK20181018), and was partially supported by the National Natural Science Foundation of China (Nos. 61806175, 61902179, 51675259).

Author information

Corresponding author

Correspondence to Kui Qian.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Qian, K., Tian, L., Liu, Y. et al. Image robust recognition based on feature-entropy-oriented differential fusion capsule network. Appl Intell 51, 1108–1117 (2021). https://doi.org/10.1007/s10489-020-01873-3

  • DOI: https://doi.org/10.1007/s10489-020-01873-3
