
Vehicle object detection method based on candidate region aggregation

Theoretical Advances, published in Pattern Analysis and Applications

Abstract

Multi-scale vehicle detection is an important application in the field of object detection, and the Feature Pyramid Network (FPN) is an important means of handling multi-scale object detection tasks. However, most existing network structures follow the baseline method, which represents the input image by selecting a single output layer of the FPN and discarding the others. This not only limits the performance of the network but also performs poorly when the scale differences between objects are large. To solve this problem, a novel candidate region aggregation network (CRAN) is proposed in this paper. The candidate regions of different feature layers are effectively aggregated to improve the generalization performance of the network. Specifically, the similarity between different feature layers is calculated by a feature quality score module and used as a quantity factor that determines the number of candidate regions retained for each feature layer; the retained regions are then aggregated into a more comprehensive candidate region group. Further, to improve the detection of small objects, an area cross entropy loss function is proposed. It makes the model pay more attention to small targets by adding a weight that decreases monotonically with object area. Finally, the proposed CRAN and the area cross entropy loss function are applied to state-of-the-art detectors. Experimental results on the KITTI and UA-DETRAC datasets show that the method performs well on vehicle objects in different scenarios and can meet the requirements of practical applications.
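The two mechanisms summarized above can be illustrated with a minimal NumPy sketch. Everything here is a hypothetical reading of the abstract, not the paper's actual implementation: the names `allocate_proposals` and `area_weighted_ce`, the proportional quota rule, and the exponential area weight are all illustrative stand-ins; the paper defines its own feature quality score and loss form in the full text.

```python
import numpy as np

def allocate_proposals(quality_scores, total_keep):
    """Split a fixed proposal budget across FPN levels in proportion to
    per-level quality scores (illustrative stand-in for the paper's
    feature quality score module acting as a 'quantity factor')."""
    scores = np.asarray(quality_scores, dtype=float)
    quotas = np.floor(total_keep * scores / scores.sum()).astype(int)
    # Hand any rounding remainder to the highest-scoring level.
    quotas[np.argmax(scores)] += total_keep - quotas.sum()
    return quotas

def area_weighted_ce(p_correct, areas, alpha=1e-4):
    """Cross entropy scaled by a weight that decreases monotonically with
    object area, so small objects contribute more to the loss
    (hypothetical form of the paper's area cross entropy)."""
    w = np.exp(-alpha * np.asarray(areas, dtype=float))
    return float(np.mean(-w * np.log(np.asarray(p_correct, dtype=float))))

# Example: 4 FPN levels sharing a budget of 1000 proposals
print(allocate_proposals([4, 3, 2, 1], 1000))  # → [400 300 200 100]
```

Under this reading, the per-level quotas replace the baseline's single-layer selection, and the aggregated proposal group is simply the union of each level's top-scoring regions up to its quota.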




References

  1. Tian Y, Du Y, Zhang Q et al (2020) Depth estimation for advancing intelligent transport systems based on self-improving pyramid stereo network. IET Intell Transp Syst 14(5):338–345. https://doi.org/10.1049/iet-its.2019.0462

  2. Liu W, Liao S, Hu W (2019) Towards accurate tiny vehicle detection in complex scenes. Neurocomputing 347:24–33

  3. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, pp 580–587. https://doi.org/10.1109/CVPR.2014.81

  4. Girshick R, Iandola F, Darrell T, Malik J (2015) Deformable part models are convolutional neural networks. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, pp 437–446. https://doi.org/10.1109/CVPR.2015.7298641

  5. Ren S, He K, Girshick R et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems. https://doi.org/10.1109/TPAMI.2016.2577031

  6. Dai J, Li Y, He K, Sun J (2016) R-FCN: object detection via region-based fully convolutional networks. arXiv preprint arXiv:1605.06409

  7. Uijlings JRR, van de Sande KEA, Gevers T et al (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171. https://doi.org/10.1007/s11263-013-0620-5

  8. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 779–788

  9. Liu W et al (2016) SSD: single shot MultiBox detector. In: Proceedings of the European Conference on Computer Vision, pp 21–37

  10. Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2980–2988

  11. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767

  12. Fu CY, Liu W, Ranga A et al (2017) DSSD: deconvolutional single shot detector. arXiv preprint arXiv:1701.06659

  13. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2961–2969

  14. Lin TY, Dollár P, Girshick R et al (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2117–2125

  15. Rossi L, Karimi A, Prati A (2021) A novel region of interest extraction layer for instance segmentation. arXiv preprint arXiv:2004.13665

  16. Farahani G (2017) Dynamic and robust method for detection and locating vehicles in the video image sequences with use of image processing algorithm. Springer International Publishing (1)

  17. Lienhart R, Maydt J (2002) An extended set of Haar-like features for rapid object detection. In: International Conference on Image Processing, pp 900–903

  18. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110

  19. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp 886–893

  20. Felzenszwalb PF, Girshick RB, McAllester D et al (2009) Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell 32(9):1627–1645

  21. Yin HP, Chen B, Chai Y et al (2016) Vision-based object detection and tracking: a review. Acta Autom Sin 42(10):1466–1489

  22. Ciresan DC, Meier U, Masci J et al (2011) High-performance neural networks for visual object classification. arXiv preprint arXiv:1102.0183

  23. Everingham M, Eslami SMA, Van Gool L et al (2015) The PASCAL visual object classes challenge: a retrospective. Int J Comput Vis 111:98–136. https://doi.org/10.1007/s11263-014-0733-5

  24. He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916

  25. Cai Z, Vasconcelos N (2019) Cascade R-CNN: high quality object detection and instance segmentation. IEEE Trans Pattern Anal Mach Intell, pp 1483–1498. https://doi.org/10.1109/TPAMI.2019.2956516

  26. Kong T, Yao A, Chen Y, Sun F (2016) HyperNet: towards accurate region proposal generation and joint object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 845–853

  27. Lee H, Eum S, Kwon H (2017) ME R-CNN: multi-expert region-based CNN for object detection. arXiv preprint arXiv:1704.01069

  28. Shrivastava A, Gupta A, Girshick R (2016) Training region-based object detectors with online hard example mining. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 761–769

  29. Wang X, Shrivastava A, Gupta A (2017) A-Fast-RCNN: hard positive generation via adversary for object detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp 3039–3048

  30. Bell S, Lawrence Zitnick C, Bala K, Girshick R (2016) Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2874–2883

  31. Gidaris S, Komodakis N (2015) Object detection via a multi-region and semantic segmentation-aware CNN model. In: Proceedings of the IEEE International Conference on Computer Vision, pp 1134–1142

  32. Shrivastava A, Gupta A (2016) Contextual priming and feedback for faster R-CNN. In: Proceedings of the European Conference on Computer Vision, Springer, pp 330–348

  33. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 6517–6525

  34. Lee W-J, Kim DW, Kang T-K, Lim M-T (2018) Convolution neural network with selective multi-stage feature fusion: case study on vehicle rear detection. Appl Sci 8:2468. https://doi.org/10.3390/app8122468

  35. Pae DS, Choi IH, Kang TK et al (2018) Vehicle detection framework for challenging lighting driving environment based on feature fusion method using adaptive neuro-fuzzy inference system. Int J Adv Robot Syst. https://doi.org/10.1177/1729881418770545

  36. Guo Y, Xu Y, Li S (2020) Dense construction vehicle detection based on orientation-aware feature fusion convolutional neural network. Autom Constr 112

  37. Wang P, Sun X, Diao W, Fu K (2020) FMSSD: feature-merged single-shot detection for multiscale objects in large-scale remote sensing imagery. IEEE Trans Geosci Remote Sens 58(5):3377–3390

  38. Gu Y, Wang B, Xu B (2018) An FPN-based framework for vehicle detection in aerial images. In: ICVIP 2018: Proceedings of the 2nd International Conference on Video and Image Processing, pp 60–64. https://doi.org/10.1145/3301506.3301531

  39. Weymar M, Löw A, Öhman A et al (2011) The face is more than its parts: brain dynamics of enhanced spatial attention to schematic threat. Neuroimage 58(3):946–954

  40. Wen L, Du D, Cai Z et al (2015) UA-DETRAC: a new benchmark and protocol for multi-object detection and tracking

  41. Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, pp 3354–3361


Acknowledgements

This work was supported by the Nondestructive Detection and Monitoring Technology for High Speed Transportation Facilities, Key Laboratory of the Ministry of Industry and Information Technology, and by the Fundamental Research Funds for the Central Universities, No. NJ2020014.

Author information


Corresponding author

Correspondence to Haitao Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, L., Wang, H., Wang, X. et al. Vehicle object detection method based on candidate region aggregation. Pattern Anal Applic 24, 1635–1647 (2021). https://doi.org/10.1007/s10044-021-01009-4

