Abstract
In autonomous driving, vehicles are recognized through computer vision and image processing to reduce the risk of accidents. However, traditional vehicle detection algorithms often struggle to handle vehicle occlusion effectively, and they require the feature map size to be adjusted as vehicle sizes vary. To address these issues, we propose RBS-YOLO, a vehicle detection model based on YOLOv5. First, the ResFusion module is designed to expand the model's receptive field and capture features at various scales. Second, a bidirectional feature pyramid network fuses features across scales to enrich the feature information available to the detector. Finally, the SIoU loss function replaces the CIoU loss function to improve the network's localization accuracy and convergence speed. The experimental results indicate that RBS-YOLO achieves a precision of 97.5% on the UA-DETRAC dataset and 72.5% on the BDD-100K dataset, exceeding YOLOv5 by 1.1% and 1.7%, respectively.
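The abstract does not say how the bidirectional feature pyramid network combines its inputs. As a minimal sketch, assuming the fast normalized fusion rule from the EfficientDet BiFPN design (which may differ from what RBS-YOLO actually uses), each input feature receives a learnable weight that is clipped to be non-negative and then normalized by the sum of all weights. The function name, the flat-list feature representation, and the toy values below are illustrative assumptions; in a real network the inputs would be feature maps resized to a common resolution.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors as sum_i (w_i / (eps + sum_j w_j)) * f_i.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same length")

    # ReLU keeps every learnable weight non-negative.
    clipped = [max(0.0, w) for w in weights]
    # eps avoids division by zero when all weights are clipped to 0.
    total = sum(clipped) + eps

    fused = [0.0] * length
    for w, feature in zip(clipped, features):
        for k, value in enumerate(feature):
            fused[k] += (w / total) * value
    return fused


# Toy inputs standing in for two feature maps at the same pyramid level.
p_top_down = [1.0, 2.0, 3.0]
p_lateral = [3.0, 2.0, 1.0]

# Equal weights give (approximately) the element-wise mean.
fused_equal = fast_normalized_fusion([p_top_down, p_lateral], [1.0, 1.0])

# A negative weight is clipped to zero, so only the second input contributes.
fused_clipped = fast_normalized_fusion([p_top_down, p_lateral], [-1.0, 2.0])
```

Because the weights are normalized by their sum rather than by a softmax, the fused output stays on the same scale as the inputs while avoiding the cost of exponentials.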
Data availability
No datasets were generated or analyzed during the current study.
Code availability
Data will be made available on reasonable request.
Acknowledgements
This work was supported by the Natural Science Foundation of Fujian Province under Grant Nos. 2020J01813 and 2021J011002, the Research Project on Education and Teaching Reform of Undergraduate Colleges and Universities in Fujian Province under Grant Nos. FBJG20210070 and FBJY20230170, and the 2022 Annual Project of the Fourteenth Five-Year Plan for Fujian Educational Science under Grant No. FJJKBK22-173.
Funding
Natural Science Foundation of Fujian Province (2020J01813), The Research Project on Education and Teaching Reform of Undergraduate Colleges and Universities in Fujian Province (FBJG20210070, FBJY20230170), The 2022 Annual Project of the Fourteenth Five-Year Plan for Fujian Educational Science (FJJKBK22-173).
Contributions
JHR contributed to conceptualization, methodology, software, investigation, formal analysis, and writing—original draft. JMY contributed to conceptualization, funding acquisition, and writing—review and editing. WJZ contributed to funding acquisition and writing—review and editing. KHC contributed to formal analysis and methodology. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Ren, J., Yang, J., Zhang, W. et al. RBS-YOLO: a vehicle detection algorithm based on multi-scale feature extraction. SIViP 18, 3421–3430 (2024). https://doi.org/10.1007/s11760-024-03007-5
DOI: https://doi.org/10.1007/s11760-024-03007-5