
GSNet: Group Sequential Learning for Image Recognition

Published in: Cognitive Computation

Abstract

In recent years, deep learning has achieved great success in image recognition, and designing a well-behaved convolutional neural network (CNN) architecture has become a challenging and important problem. Traditional group convolution cannot effectively address the problem of “information blocking” between channel groups; hence, this work proposes an efficient CNN-based model that achieves effective information exchange between channels. A novel Group Sequential (GS) learning scheme, which combines a channel split operation with sequential learning, is introduced to improve recognition performance by increasing inter-channel communication. Several state-of-the-art models are rebuilt with GS blocks, which significantly boost the performance and robustness of the underlying CNN models. Extensive experiments evaluate the proposed GSNet framework, and the results show its superiority on several benchmarks (i.e., the CIFAR-10, CIFAR-100, Tiny ImageNet, ImageNet, and FOOD-101 datasets). Moreover, compared with traditional residual networks (e.g., ResNet-101), the proposed network achieves a notable improvement with fewer parameters: the error rate on the FOOD-101 dataset decreases from 19.08% to 16.02%. The proposed GS block thus has significant potential to improve image recognition performance and to advance the development of cognitive computation. The results demonstrate the superiority of the proposed method and indicate excellent generalization ability. Code is available at: https://github.com/shao15xiang/GSNet.
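As a rough illustration of the idea described in the abstract, the following minimal NumPy sketch shows how a channel split followed by sequential per-group processing lets information flow from one group to the next. The group count, the additive fusion of the previous group's output, and the per-group linear mixing are illustrative assumptions, not the paper's exact operations:

```python
import numpy as np

def group_sequential(x, weights):
    """Sketch of a Group Sequential (GS) pass: split channels into groups,
    then process groups in order, feeding each group's output into the next
    group's input so information is exchanged across channel groups.

    x       : array of shape (channels, spatial), channels first
    weights : list of per-group mixing matrices, one (c_g, c_g) matrix per group
    """
    groups = np.split(x, len(weights), axis=0)  # channel split operation
    outputs = []
    prev = None
    for g, w in zip(groups, weights):
        inp = g if prev is None else g + prev   # fuse previous group's output
        out = w @ inp                           # per-group linear mixing
        outputs.append(out)
        prev = out
    return np.concatenate(outputs, axis=0)      # reassemble all channels
```

In contrast, a plain group convolution would process each group independently (`inp = g` for every group), which is exactly the "information blocking" the GS block is designed to avoid.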



Funding

This work was supported in part by the National Natural Science Foundation of China (NSFC 62073129, 61673163), Chang-Zhu-Tan National Indigenous Innovation Demonstration Zone Project (2017XK2102).

Author information

Corresponding author

Correspondence to Qiaokang Liang.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Conflict of Interest

The authors declare that they have no conflict of interest.

About this article


Cite this article

Xiang, S., Liang, Q., Sun, W. et al. GSNet: Group Sequential Learning for Image Recognition. Cogn Comput 13, 538–551 (2021). https://doi.org/10.1007/s12559-020-09815-4
