Abstract
The efficiency of the convolutional neural network (CNN) model is one of the biggest limitations when CNNs are deployed on mobile devices. As an efficient convolution method, pointwise convolution is widely used in many networks to expand and compress the channel dimension. Nevertheless, pointwise convolution still consumes a vast number of calculations and parameters. In this paper, depthwise channel ascent (DCA) and group channel descent (GCD) are introduced as efficient channel transformation methods to replace pointwise convolution. DCA and GCD decompose the global channel transformation into local channel transformations. DCA utilizes spatial features to expand the channel dimension, while GCD utilizes non-learned channel compression to reduce the channel dimension. Compared with other counterparts, networks equipped with DCA and GCD significantly reduce calculations and parameters while maintaining competitive accuracy. The performance of the proposed method has been verified on different networks and multiple datasets. On the DCASE2019 dataset, the proposed method reduces the parameters and computation of MobileNetV2 to 41.9% and 30.2% of the original, respectively, and the inference time to 73.8%.
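The abstract only describes DCA and GCD at a high level; the paper's actual layer definitions are not reproduced here. The following is a minimal NumPy sketch of what such operations might look like, under assumptions of our own: DCA is modeled as a depthwise convolution with a channel multiplier `k` (each input channel gets its own `k` spatial 3x3 filters, expanding `C` channels to `C*k`), and GCD as non-learned averaging of channels within groups. All function names, shapes, and parameters (`dca`, `gcd`, `k`, `groups`) are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def dca(x, weights):
    """Depthwise channel ascent (sketch): expand channels using spatial
    features only. Each input channel is convolved with its own k filters,
    so C channels become C*k channels without any cross-channel mixing.
    x: (C, H, W); weights: (C, k, 3, 3); 'same' zero padding."""
    C, H, W = x.shape
    k = weights.shape[1]
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # pad H and W by 1
    out = np.empty((C * k, H, W))
    for c in range(C):
        for j in range(k):
            w = weights[c, j]
            acc = np.zeros((H, W))
            # accumulate the 3x3 spatial filter as shifted slices
            for di in range(3):
                for dj in range(3):
                    acc += w[di, dj] * xp[c, di:di + H, dj:dj + W]
            out[c * k + j] = acc
    return out

def gcd(x, groups):
    """Group channel descent (sketch): non-learned channel compression.
    Channels are split into `groups` contiguous groups and averaged,
    reducing C channels to `groups` channels with zero parameters."""
    C, H, W = x.shape
    assert C % groups == 0, "C must be divisible by the number of groups"
    return x.reshape(groups, C // groups, H, W).mean(axis=1)
```

Because DCA touches each channel independently and GCD has no weights at all, both avoid the C_in x C_out cost of a pointwise (1x1) convolution, which is the source of the parameter and computation savings the abstract reports.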
Acknowledgements
The authors acknowledge the financial support from the National Natural Science Fund of China (Grant No. 62001323).
Cite this article
Zhang, T., Li, S., Feng, G. et al. Local channel transformation for efficient convolutional neural network. SIViP 17, 129–137 (2023). https://doi.org/10.1007/s11760-022-02212-4