
Local channel transformation for efficient convolutional neural network

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

The efficiency of the convolutional neural network (CNN) model is one of the biggest obstacles to deploying CNNs on mobile devices. Pointwise convolution is widely used as an efficient operation to expand and compress channel dimensions in many networks; nevertheless, it still consumes a large share of a network's computation and parameters. In this paper, depthwise channel ascent (DCA) and group channel descent (GCD) are introduced as efficient channel transformation methods that replace pointwise convolution. DCA and GCD decompose the global channel transformation into local channel transformations: DCA utilizes spatial features to expand the channel dimension, while GCD compresses the channel dimension without learned parameters. Compared with their counterparts, networks equipped with DCA and GCD significantly reduce computation and parameters while maintaining competitive accuracy. The performance of the proposed method is verified on different networks and multiple datasets. On the DCASE2019 dataset, the proposed method reduces the parameters and computation of MobileNetV2 to 41.9% and 30.2% of the original, respectively, and reduces inference time to 73.8%.
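As a rough illustration of why local channel transformations can be cheaper than pointwise convolution, the sketch below compares parameter counts under an assumed cost model. The function names, the 3×3 kernel size, and the group-mean compression are illustrative assumptions of ours, not the paper's exact DCA/GCD modules:

```python
import numpy as np

def pointwise_params(c_in, c_out):
    # A 1x1 (pointwise) convolution mixes all channels:
    # one weight per (input channel, output channel) pair.
    return c_in * c_out

def dca_like_params(c_in, t, k=3):
    # DCA-style local expansion (assumed form): each input channel is
    # expanded into t new channels by its own k x k spatial filters,
    # so no cross-channel weights are needed.
    return c_in * t * k * k

def gcd_like_compress(x, groups):
    # GCD-style non-learned descent (assumed form): average each group of
    # consecutive channels, shrinking C channels down to `groups` channels
    # with zero parameters.
    c, h, w = x.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    return x.reshape(groups, c // groups, h, w).mean(axis=1)

c_in, t = 64, 6  # MobileNetV2-like expansion ratio
print(pointwise_params(c_in, c_in * t))  # 24576 weights for 1x1 expansion
print(dca_like_params(c_in, t))          # 3456 weights for local expansion

x = np.random.rand(12, 8, 8)             # (channels, height, width)
print(gcd_like_compress(x, 4).shape)     # (4, 8, 8)
```

Under these assumptions the local expansion needs roughly 1/C_out of the pointwise weights per output channel, which is consistent with the order-of-magnitude savings the abstract reports.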



Acknowledgements

The authors acknowledge the financial support from the National Natural Science Fund of China (Grant No. 62001323).

Author information

Corresponding author: Jinhua Liang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, T., Li, S., Feng, G. et al. Local channel transformation for efficient convolutional neural network. SIViP 17, 129–137 (2023). https://doi.org/10.1007/s11760-022-02212-4
