Abstract
To address the weak detection of small targets in traditional methods, an improved object detection algorithm is proposed. First, drawing on the feature pyramid network (FPN) and the single shot multibox detector (SSD), the six multi-scale feature maps extracted by the original SSD algorithm are fused in turn to form a new feature map that carries both detailed and semantic information. Then, an attention model is added to the fused feature map so that the feature information of small targets can be extracted effectively. With PASCAL VOC2007 and VOC2012 as the training set, the mean average precision on the VOC2007 test set reached 78.3%, which is 1.1 percentage points higher than the original algorithm. In different environments, the algorithm detects densely distributed small objects accurately, and its missed-detection rate and robustness are better than those of other algorithms. At the same time, the detection speed still meets real-time requirements.
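The two steps described in the abstract (fusing the six SSD feature maps in turn, then applying attention) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the fusion operator or the attention module, so the sketch assumes FPN-style top-down fusion by nearest-neighbour upsampling and element-wise addition, and a parameter-free sigmoid channel gate. The spatial sizes are those of SSD300; the function names, the equal channel count across levels, and the toy values are hypothetical.

```python
import math

# Spatial sizes of the six SSD300 prediction layers, shallow to deep.
# Assumption: every level has the same channel count (a real network would
# first align channels, e.g. with 1x1 convolutions).
SSD_SIZES = [38, 19, 10, 5, 3, 1]


def make_map(size, channels, value):
    """Build a toy [C][H][W] feature map filled with a constant value."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


def resize_nearest(fmap, size):
    """Nearest-neighbour resize of a square [C][H][W] map to [C][size][size]."""
    src = len(fmap[0])
    return [
        [[ch[i * src // size][j * src // size] for j in range(size)]
         for i in range(size)]
        for ch in fmap
    ]


def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shape."""
    return [
        [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]


def fuse_top_down(maps):
    """Fuse maps in turn, from the deepest (smallest) to the shallowest."""
    fused = [maps[-1]]
    for fmap in reversed(maps[:-1]):
        size = len(fmap[0])
        # Upsample the previously fused (deeper) map and add it to this level.
        fused.append(add_maps(fmap, resize_nearest(fused[-1], size)))
    return fused[::-1]  # restore shallow-to-deep order


def channel_attention(fmap):
    """Scale each channel by the sigmoid of its global average (assumed gate)."""
    out = []
    for ch in fmap:
        mean = sum(map(sum, ch)) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[v * gate for v in row] for row in ch])
    return out


if __name__ == "__main__":
    maps = [make_map(s, 2, 1.0) for s in SSD_SIZES]
    fused = [channel_attention(f) for f in fuse_top_down(maps)]
    print([len(f[0]) for f in fused])  # spatial size of each fused level
```

In this sketch the shallowest fused map accumulates contributions from all five deeper levels, which is the intended effect of combining high-resolution detail with deeper semantic information; a learned attention module would replace the fixed sigmoid gate.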
Data availability
The data that support the findings of this study are available within the article, and the code that supports the findings is available from the corresponding author upon reasonable request.
Acknowledgements
We thank the anonymous reviewers for their constructive comments. This work is supported by the National Natural Science Foundation of China (No. 61803294) and the Natural Science Foundation of Shaanxi Province, China (No. 2020JM-499, No. 2020JQ-684).
Funding
Funding was provided by the National Natural Science Foundation, China (Grant No. 61803294) and the Natural Science Foundation of Shaanxi Province, China (Grant No. 2020JM-499, No. 2020JQ-684).
Author information
Contributions
YW proposed the idea of this algorithm and revised the manuscript. XL drafted the first version of this paper, analyzing the new feature map that combines detailed and semantic information, based on the feature pyramid network and the single shot multibox detector algorithm. RG conducted part of the simulation experiments.
Ethics declarations
Conflict of interest
There are no potential competing interests in our paper. All authors have seen the manuscript and approved its submission to your journal. We confirm that the contents of the manuscript have not been published or submitted for publication elsewhere.
About this article
Cite this article
Wang, Y., Liu, X. & Guo, R. An object detection algorithm based on the feature pyramid network and single shot multibox detector. Cluster Comput 25, 3313–3324 (2022). https://doi.org/10.1007/s10586-022-03560-z
DOI: https://doi.org/10.1007/s10586-022-03560-z