
Lightweight object detection model fused with feature pyramid

Published in: Multimedia Tools and Applications

Abstract

Object detection has long been a hotspot in computer vision. However, most one-stage lightweight object detection models based on deep convolutional neural networks still carry a large number of parameters. To address this problem, this paper proposes a new model named Fusion Shuffle Light Detector (FSLDet). First, starting from the FSSD model, we apply an improved lightweight ShuffleNet V2 network to FSSD for feature extraction, where the improvement to ShuffleNet V2 is an adjustment of its network structure. Meanwhile, we adopt a bidirectional feature pyramid to improve the feature fusion operation, so that the fused features carry more semantic information. Experiments were carried out on the PASCAL VOC 2007 + 2012 dataset and a helmet detection dataset. The results show that the FSLDet model is superior to state-of-the-art models under multiple evaluation criteria.
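As a rough illustration of the two components the abstract names, a ShuffleNet V2-style unit for the lightweight backbone and a bidirectional (BiFPN-style) weighted fusion of pyramid features, the sketch below shows how channel shuffle and fast normalized fusion are commonly implemented in PyTorch. All module names, channel sizes, and the fusion formula are assumptions drawn from the public ShuffleNet V2 and EfficientDet papers, not the authors' FSLDet code.

```python
# Illustrative sketch only (not the authors' implementation): a ShuffleNet V2-style
# channel shuffle and a BiFPN-style fast normalized fusion of two pyramid levels.
import torch
import torch.nn as nn
import torch.nn.functional as F


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups (the ShuffleNet V2 building block)."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)   # split channels into groups
    x = x.transpose(1, 2).contiguous()         # swap the group and channel dims
    return x.view(n, c, h, w)                  # flatten back to (N, C, H, W)


class WeightedFusion(nn.Module):
    """BiFPN-style weighted fusion of two feature maps at one pyramid level."""

    def __init__(self, channels: int):
        super().__init__()
        self.w = nn.Parameter(torch.ones(2))   # learnable, kept non-negative
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, p_same: torch.Tensor, p_other: torch.Tensor) -> torch.Tensor:
        # Resize the other level to the current resolution before fusing.
        p_other = F.interpolate(p_other, size=p_same.shape[-2:], mode="nearest")
        w = F.relu(self.w)
        w = w / (w.sum() + 1e-4)                # fast normalized fusion weights
        fused = w[0] * p_same + w[1] * p_other
        return self.conv(F.relu(fused))


if __name__ == "__main__":
    feats = torch.randn(1, 116, 40, 40)        # toy backbone feature map
    print(channel_shuffle(feats, groups=2).shape)
    fuse = WeightedFusion(channels=116)
    print(fuse(feats, torch.randn(1, 116, 20, 20)).shape)
```

In an FSLDet-like design, backbone features at several scales would pass through such fusion nodes in both top-down and bottom-up directions before reaching the detection heads; the exact number of levels and channels here is hypothetical.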




Acknowledgements

This work is funded by the National Natural Science Foundation of China under Grant No. 61772180, the Key R&D Plan of Hubei Province (2020BHB004, 2020BAB012), and the Natural Science Foundation of Hubei Province (No. 2020CFB798).

Author information


Corresponding author

Correspondence to Lingyu Yan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, C., Wang, Z., Li, K. et al. Lightweight object detection model fused with feature pyramid. Multimed Tools Appl 82, 601–618 (2023). https://doi.org/10.1007/s11042-022-12127-4

