YOLOv3-MT: A YOLOv3 using multi-target tracking for vehicle visual detection

Applied Intelligence

Abstract

During autonomous driving, complex backgrounds and mutual occlusion between targets can mislead the detector and cause missed detections. If an occluded target is only detected again once it is at close range, the vehicle may be unable to respond in time, which can cause a fatal accident. Driver-assistance systems therefore require a model that accurately identifies partially occluded targets against complex backgrounds and performs short-term tracking and early warning for completely occluded objects. This paper proposes a YOLOv3-based method that improves detection accuracy while supporting real-time operation and provides real-time warnings for completely occluded objects. First, we obtain more suitable prior (anchor) box settings through class-wise K-means clustering. Because the max-pooling operation of the original CBAM easily introduces background noise, we propose AS-CBAM (Adaptive Selection Convolutional Block Attention Module) and combine it with HDC (Hybrid Dilated Convolution) to maximize the receptive field and refine the features. A 1×1 convolution is used to limit the growth in the number of parameters. DIoU-NMS replaces traditional NMS. In addition, a tracking algorithm based on Kalman filtering and Hungarian matching is introduced to improve the system’s ability to recognize occluded objects. Compared with the original YOLOv3, the proposed method increases mAP by 1.32% and 1.47% on KITTI and UA-DETRAC, respectively. On Object-Mask, a dataset that focuses on occlusion conditions, it runs at 35.07 FPS and achieves a larger accuracy improvement (90.36% vs. 85.71%). The proposed algorithm is therefore better suited to autonomous driving applications.
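
The abstract names DIoU-NMS as the replacement for traditional NMS, but this preview does not describe the authors' implementation. The following is only a minimal sketch of the commonly described formulation, in which a candidate box is suppressed when its IoU with a higher-scoring box, minus the squared distance between box centers normalized by the squared diagonal of the smallest enclosing box, reaches a threshold. The function names `diou` and `diou_nms`, the `(x1, y1, x2, y2)` box layout, and the threshold of 0.5 are assumptions made for illustration.

```python
def diou(a, b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2) corner tuples (assumed layout)."""
    # Intersection area (zero if the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
       + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw * cw + ch * ch
    return iou - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes, scores, threshold=0.5):
    """Return indices of kept boxes, highest score first (threshold is illustrative)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only candidates whose DIoU with the chosen box is below the threshold.
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores)
```

Because the center-distance penalty lowers the score of overlapping boxes whose centers are far apart, two heavily overlapping detections with distinct centers are less likely to be merged than under plain IoU-based NMS, which is the usual motivation for this variant in occlusion-heavy scenes.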

Acknowledgements

This work was supported by the National Natural Science Foundation of China (grant number U1733119) and the Central University basic scientific research business fee project of Civil Aviation University of China (grant number 3122018C001).

Author information

Corresponding author

Correspondence to Maozhen Liu.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, K., Liu, M. YOLOv3-MT: A YOLOv3 using multi-target tracking for vehicle visual detection. Appl Intell 52, 2070–2091 (2022). https://doi.org/10.1007/s10489-021-02491-3

  • DOI: https://doi.org/10.1007/s10489-021-02491-3