Abstract
Lightweight CNN models aim to extend deep learning from conventional image classification to image classification on mobile edge devices. However, the accuracy of current lightweight CNN models is not yet comparable to that of traditional large CNN models. To improve the accuracy of mobile-platform-based image classification, we propose MobileACNet, a novel ACNet-based lightweight model built on MobileNetV3 (a popular lightweight CNN for image classification on mobile platforms). Our model adopts a similar idea to ACNet: it adaptively combines global inference and local inference to improve classification accuracy. We improve MobileNetV3 by replacing its inverted residual block with our proposed adaptive inverted residual (AIR) module. Experimental results on three public datasets, i.e., CIFAR-100, Tiny ImageNet, and the large-scale ImageNet, show that the additional adaptive global inference provided by MobileACNet effectively improves image classification accuracy for mobile platforms.
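To make the abstract's idea concrete, below is a minimal PyTorch sketch of what an "adaptive inverted residual" block could look like: a standard MobileNetV2/V3-style inverted residual (1x1 expand, 3x3 depthwise, 1x1 linear projection) whose local depthwise branch is fused with a global-inference branch through learned mixing weights, in the spirit of ACNet's adaptive connections. This is an illustration under stated assumptions, not the authors' implementation; the class name `AdaptiveInvertedResidual`, the pooled-MLP global branch, and the softmax gating are all assumptions.

```python
# Hypothetical sketch of an adaptive inverted residual (AIR) block.
# Not the paper's code: the global branch and gating scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveInvertedResidual(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, expand: int = 4, stride: int = 1):
        super().__init__()
        mid = in_ch * expand
        self.use_residual = stride == 1 and in_ch == out_ch
        # 1x1 expansion, as in MobileNetV2/V3 inverted residuals.
        self.expand = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU6(inplace=True),
        )
        # Local inference: the usual 3x3 depthwise convolution.
        self.local = nn.Sequential(
            nn.Conv2d(mid, mid, 3, stride=stride, padding=1, groups=mid, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU6(inplace=True),
        )
        # Global inference: pool to a per-channel descriptor, transform it,
        # and broadcast it back over the spatial grid (one plausible reading
        # of ACNet-style global reasoning).
        self.globl = nn.Sequential(nn.Linear(mid, mid), nn.ReLU6(inplace=True))
        # Two learnable gates decide how to weight local vs. global inference.
        self.gate = nn.Parameter(torch.zeros(2))
        # 1x1 projection back down with no activation (linear bottleneck).
        self.project = nn.Sequential(
            nn.Conv2d(mid, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.expand(x)
        local = self.local(h)
        # Global branch: global average pool -> MLP -> broadcast to local's size.
        g = self.globl(F.adaptive_avg_pool2d(h, 1).flatten(1))
        globl = g[:, :, None, None].expand_as(local)
        w = torch.softmax(self.gate, dim=0)  # adaptive mixing weights
        out = self.project(w[0] * local + w[1] * globl)
        return x + out if self.use_residual else out


# Smoke test on a CIFAR-100-sized input.
if __name__ == "__main__":
    block = AdaptiveInvertedResidual(16, 16)
    print(block(torch.randn(2, 16, 32, 32)).shape)  # torch.Size([2, 16, 32, 32])
```

A block of this shape is a drop-in replacement for MobileNetV3's inverted residual, which matches the paper's description of swapping that block for the AIR module; the gating lets each block learn its own balance between local and global inference.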
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Jiang, T., Zong, M., Ma, Y., Hou, F., Wang, R. (2023). MobileACNet: ACNet-Based Lightweight Model for Image Classification. In: Yan, W.Q., Nguyen, M., Stommel, M. (eds) Image and Vision Computing. IVCNZ 2022. Lecture Notes in Computer Science, vol 13836. Springer, Cham. https://doi.org/10.1007/978-3-031-25825-1_26
DOI: https://doi.org/10.1007/978-3-031-25825-1_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25824-4
Online ISBN: 978-3-031-25825-1
eBook Packages: Computer Science, Computer Science (R0)