
RBS-YOLO: a vehicle detection algorithm based on multi-scale feature extraction

  • Original Paper
Signal, Image and Video Processing

Abstract

In autonomous driving, vehicles are recognized through computer vision and image processing to reduce the risk of accidents. However, traditional vehicle detection algorithms often struggle to handle vehicle occlusion effectively, and they require the feature map size to be modified as vehicle sizes vary. To address these issues, we propose RBS-YOLO, a vehicle detection model based on YOLOv5. First, the ResFusion module is designed to expand the model’s receptive field and capture features at various scales. Second, a bidirectional feature pyramid network fuses features across scales to enrich the feature information. Finally, the SIoU loss function replaces the CIoU loss function to improve the network’s localization accuracy and convergence speed. Experimental results indicate that RBS-YOLO achieves a precision of 97.5% and 72.5% on the UA-DETRAC and BDD-100K datasets, respectively, exceeding YOLOv5 by 1.1% and 1.7%.
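The abstract does not spell out how the bidirectional feature pyramid network combines its inputs. As a rough illustration only, the sketch below shows the "fast normalized fusion" rule commonly associated with bidirectional feature pyramids: each input gets a non-negative learnable weight, and the weighted sum is divided by the sum of the weights. Whether RBS-YOLO uses exactly this rule is an assumption; the function name `fast_normalized_fusion`, the flat-list feature representation, and the `eps` value are all hypothetical choices made for this example.

```python
from typing import List, Sequence

EPS = 1e-4  # small constant to avoid division by zero (assumed value)


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = EPS,
) -> List[float]:
    """Fuse same-sized feature vectors with normalized non-negative weights.

    Illustrative sketch only: real feature maps are multi-dimensional
    tensors that are resized to a common resolution before fusion.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature")
    if not features:
        raise ValueError("need at least one input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same size")

    # Clip weights at zero (a ReLU) so each input contributes non-negatively.
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped) + eps

    # Weighted sum of the inputs, normalized by the total weight.
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0]  # stand-in for a high-resolution feature
    deep = [3.0, 4.0]     # stand-in for an upsampled low-resolution feature
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs; a negative weight is clipped to zero, so that input is effectively ignored.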

Data availability

No datasets were generated or analyzed during the current study.

Code availability

Data will be made available on reasonable request.

Acknowledgements

This work was supported by the Natural Science Foundation of Fujian Province under Grant Nos. 2020J01813 and 2021J011002, the Research Project on Education and Teaching Reform of Undergraduate Colleges and Universities in Fujian Province under Grant Nos. FBJG20210070 and FBJY20230170, and the 2022 Annual Project of the Fourteenth Five-Year Plan for Fujian Educational Science under Grant No. FJJKBK22-173.

Funding

Natural Science Foundation of Fujian Province (2020J01813), the Research Project on Education and Teaching Reform of Undergraduate Colleges and Universities in Fujian Province (FBJG20210070, FBJY20230170), and the 2022 Annual Project of the Fourteenth Five-Year Plan for Fujian Educational Science (FJJKBK22-173).

Author information

Contributions

JHR contributed to conceptualization, methodology, software, investigation, formal analysis, and writing—original draft. JMY contributed to conceptualization, funding acquisition, and writing—review and editing. WJZ contributed to funding acquisition and writing—review and editing. KHC contributed to formal analysis and methodology. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Jingmin Yang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ren, J., Yang, J., Zhang, W. et al. RBS-YOLO: a vehicle detection algorithm based on multi-scale feature extraction. SIViP 18, 3421–3430 (2024). https://doi.org/10.1007/s11760-024-03007-5

  • DOI: https://doi.org/10.1007/s11760-024-03007-5
