A general multi-scale image classification based on shared conversion matrix routing

Wang, Yuxiao; Li, Kai; Lei, Yu

doi:10.1007/s10489-021-02558-1

Published: 01 July 2021

Volume 52, pages 3249–3265 (2022)
Applied Intelligence

Abstract

Deep convolutional neural networks require input images of a fixed size because of their fully connected layers. Stretching or cropping can bring an image to the required size, but these operations easily distort it. Methods based on groups of pooling layers convert variable-size feature maps into fixed-size ones, but the pooling operations lose information and significantly reduce recognition accuracy. To address this problem, we propose the shared conversion matrix routing (SCMR) layer, a general network layer that replaces the fully connected layer of a convolutional neural network. A network equipped with this layer can handle multi-scale images without any change to its original convolutional structure or parameters. Within the SCMR layer, we propose a RECOMBINATION method that dynamically increases or decreases the number of capsules according to the scale of the input image, so that both the convolutional layers and the SCMR layer operate normally. We also establish a new dynamic routing algorithm that shares the transformation matrix within the SCMR layer, so that the layer can receive multi-dimensional convolutional feature maps and output fixed-size image features for multi-scale image classification. The algorithm gives each capsule a corresponding weight, which avoids feature loss and improves the recognition rate. In addition, new capsules are created by increasing the capsule dimension to solve the exploding gradient problem in backpropagation. Experimental results on public datasets show that the proposed method achieves higher accuracy than recent methods.
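The central idea of the abstract, that sharing the transformation matrix across input capsules makes the output size independent of the number of capsules, can be illustrated with a small sketch. This is not the authors' SCMR layer: it is standard routing-by-agreement (as introduced by Sabour et al.) with one transformation matrix per output class applied to every input capsule. The names `shared_matrix_routing`, `shared_W`, `D_IN`, `D_OUT`, and `N_CLASSES` are hypothetical, and the random weights stand in for learned parameters.

```python
import math
import random


def squash(s):
    """Capsule nonlinearity: keeps the direction, maps the length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, u):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, u)) for row in W]


def shared_matrix_routing(capsules, shared_W, iterations=3):
    """Route N input capsules (N may vary) to len(shared_W) output capsules.

    Assumption for illustration: shared_W[j] is a single D_OUT x D_IN matrix
    used for every input capsule, so the parameter count does not depend on N.
    """
    n_in, n_out = len(capsules), len(shared_W)
    # Prediction vectors: the same matrix shared_W[j] is applied to each capsule.
    u_hat = [[matvec(W_j, u) for W_j in shared_W] for u in capsules]
    logits = [[0.0] * n_out for _ in range(n_in)]
    outputs = []
    for _ in range(iterations):
        # Coupling coefficients: each input capsule distributes weight over outputs.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_out):
            d_out = len(shared_W[j])
            s = [sum(coupling[i][j] * u_hat[i][j][k] for i in range(n_in))
                 for k in range(d_out)]
            outputs.append(squash(s))
        # Agreement update: raise the logit where prediction and output align.
        for i in range(n_in):
            for j in range(n_out):
                logits[i][j] += sum(a * b for a, b in zip(u_hat[i][j], outputs[j]))
    return outputs


random.seed(0)
D_IN, D_OUT, N_CLASSES = 8, 4, 3  # hypothetical sizes
shared_W = [[[random.gauss(0, 0.1) for _ in range(D_IN)] for _ in range(D_OUT)]
            for _ in range(N_CLASSES)]

# Two "image scales" that yield different numbers of input capsules.
for n_caps in (16, 36):
    caps = [[random.gauss(0, 1) for _ in range(D_IN)] for _ in range(n_caps)]
    out = shared_matrix_routing(caps, shared_W)
    print(n_caps, "input capsules ->", len(out), "x", len(out[0]), "output")
```

With 16 or 36 input capsules the result is the same 3 x 4 set of output capsules, which is the property that lets a routing layer of this kind sit after convolutional layers whose feature-map size varies with the input image.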

Acknowledgements

This work was supported by the Natural Science Foundation of Hebei Province (No. F2018201060) and the Post-graduate’s Innovation Fund Project of Hebei University (HBU2021ss059).

Author information

Authors and Affiliations

School of Cyber Security and Computer, Hebei University, Baoding, 071002, China
Yuxiao Wang, Kai Li & Yu Lei

Corresponding author

Correspondence to Kai Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Y., Li, K. & Lei, Y. A general multi-scale image classification based on shared conversion matrix routing. Appl Intell 52, 3249–3265 (2022). https://doi.org/10.1007/s10489-021-02558-1

Accepted: 21 May 2021
Published: 01 July 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s10489-021-02558-1
