
L-Net: lightweight and fast object detector-based ShuffleNetV2

  • Original Research Paper
  • Published:
Journal of Real-Time Image Processing

Abstract

Object detection algorithms based on deep learning have made continuous progress in recent years. While preserving detection accuracy, reducing model complexity and improving detection speed remain the central goals of current object detection algorithms. This paper presents L-Net, a lightweight object detection model whose backbone is based on the ShuffleNetV2 network structure. A suitable backbone network was obtained by replacing the 3 × 3 depthwise convolutions with 5 × 5 depthwise convolutions and reducing the number of input channels. To obtain a more discriminative image feature description, a Pyramid Pooling Module and an Attention Pyramid Module are added after the backbone network. Experimental results show that the L-Net model uses only 1.54B FLOPs (floating-point operations) to achieve 70.2% mAP (mean average precision) on PASCAL VOC 2007 and 21.8% mAP on the MS COCO dataset. The model achieves competitive accuracy and speed while remaining lightweight.
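The abstract gives enough detail to sketch the backbone modification. Below is a minimal PyTorch sketch of a ShuffleNetV2-style basic unit in which the 3 × 3 depthwise convolution is widened to 5 × 5. The class name `LNetUnit`, the channel split, and the layer ordering follow the standard ShuffleNetV2 unit and are assumptions for illustration; the authors' exact block design (and their Attention Pyramid Module) is not reproduced in this preview.

```python
# Illustrative reconstruction, not the authors' released code: only the
# 5 x 5 depthwise-convolution change described in the abstract is shown.
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int = 2) -> torch.Tensor:
    """Interleave channels across groups (standard ShuffleNetV2 operation)."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)


class LNetUnit(nn.Module):
    """ShuffleNetV2 basic unit with the 3 x 3 depthwise conv widened to 5 x 5."""

    def __init__(self, channels: int):
        super().__init__()
        branch_channels = channels // 2
        self.branch = nn.Sequential(
            # 1 x 1 pointwise conv
            nn.Conv2d(branch_channels, branch_channels, 1, bias=False),
            nn.BatchNorm2d(branch_channels),
            nn.ReLU(inplace=True),
            # 5 x 5 depthwise conv (padding=2 keeps the spatial size);
            # the original ShuffleNetV2 unit uses a 3 x 3 kernel here
            nn.Conv2d(branch_channels, branch_channels, 5, padding=2,
                      groups=branch_channels, bias=False),
            nn.BatchNorm2d(branch_channels),
            # 1 x 1 pointwise conv
            nn.Conv2d(branch_channels, branch_channels, 1, bias=False),
            nn.BatchNorm2d(branch_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split channels: one half passes through as an identity shortcut,
        # the other half goes through the conv branch.
        x1, x2 = x.chunk(2, dim=1)
        out = torch.cat((x1, self.branch(x2)), dim=1)
        return channel_shuffle(out, groups=2)


if __name__ == "__main__":
    # Sanity check: the unit preserves the feature-map shape.
    x = torch.randn(1, 64, 56, 56)
    print(LNetUnit(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```

A 5 × 5 depthwise kernel enlarges the receptive field at modest cost, since depthwise FLOPs grow with the kernel area but not with the product of input and output channels; this is consistent with the paper's low 1.54B-FLOP budget.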





Acknowledgements

This work is supported by the Ministry of Education, China (201901055015), the Natural Science Foundation of Shandong Province (ZR2020KE023), and the Shandong University of Science and Technology (JXTD20170503).

Author information


Corresponding author

Correspondence to Jin Han.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Han, J., Yang, Y. L-Net: lightweight and fast object detector-based ShuffleNetV2. J Real-Time Image Proc 18, 2527–2538 (2021). https://doi.org/10.1007/s11554-021-01145-4


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11554-021-01145-4
