MobileACNet: ACNet-Based Lightweight Model for Image Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13836)

Abstract

Lightweight CNN models aim to extend deep learning from conventional image classification to image classification on mobile edge devices. However, the accuracy of current lightweight CNN models still falls short of that of traditional large CNN models. To improve the accuracy of mobile-platform-based image classification, we propose MobileACNet, a novel ACNet-based lightweight model built on MobileNetV3 (a popular lightweight CNN for image classification on mobile platforms). Our model adopts an idea similar to ACNet's: adaptively combining global and local inference to improve classification accuracy. We improve MobileNetV3 by replacing its inverted residual block with our proposed adaptive inverted residual (AIR) module. Experimental results on three public datasets, i.e., CIFAR-100, Tiny ImageNet, and the large-scale ImageNet, show that MobileACNet effectively improves image classification accuracy by providing additional adaptive global inference for mobile-platform-based image classification.
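The abstract describes the AIR module as adaptively combining local inference (as in a standard convolution) with global inference (as in ACNet's adaptively connected design), but gives no implementation details. As a rough illustration of that idea only, here is a minimal NumPy sketch; all names, shapes, and the scalar gate `alpha` are hypothetical, and the real module operates on convolutional feature maps with learned gates:

```python
import numpy as np

def adaptive_inference(x, w_local, w_global, alpha):
    """Hypothetical sketch: adaptively mix local and global inference.

    x        : (N, D) array of per-position feature vectors
    w_local  : (D, D) weights applied independently at each position (local branch)
    w_global : (D, D) weights applied to the pooled global context (global branch)
    alpha    : scalar gate in [0, 1]; learned per-layer in the actual model
    """
    # Local branch: position-wise transform, analogous to a 1x1 convolution.
    local_out = x @ w_local
    # Global branch: pool over all positions, transform, broadcast back.
    global_ctx = x.mean(axis=0, keepdims=True)        # (1, D) global context
    global_out = np.repeat(global_ctx @ w_global, x.shape[0], axis=0)
    # Adaptive combination of the two inference paths.
    return alpha * local_out + (1.0 - alpha) * global_out
```

With `alpha = 1` the module reduces to purely local inference; with `alpha = 0` every position sees only the pooled global context, which is the adaptive trade-off the paper's title refers to.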



Author information


Corresponding authors

Correspondence to Ming Zong or Feng Hou.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Jiang, T., Zong, M., Ma, Y., Hou, F., Wang, R. (2023). MobileACNet: ACNet-Based Lightweight Model for Image Classification. In: Yan, W.Q., Nguyen, M., Stommel, M. (eds) Image and Vision Computing. IVCNZ 2022. Lecture Notes in Computer Science, vol 13836. Springer, Cham. https://doi.org/10.1007/978-3-031-25825-1_26

  • DOI: https://doi.org/10.1007/978-3-031-25825-1_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25824-4

  • Online ISBN: 978-3-031-25825-1

  • eBook Packages: Computer Science, Computer Science (R0)
