
Survey on Image Object Detection Algorithms Based on Deep Learning

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12736)

Abstract

With advances in image processing technology, computer vision has become increasingly popular. In recent years, deep learning has flourished and driven significant progress in object detection. Since the R-CNN framework was proposed, deep-learning-based object detection frameworks have gradually become mainstream; they fall into two categories: region-based and regression-based. Organized around these two categories, this paper first summarizes the research background, then discusses the candidate-region-based algorithms, represented by Faster R-CNN, and the regression-based algorithms, represented by the YOLO series. Following their historical development, it summarizes the frameworks proposed in recent years, compares the performance of object detection algorithms on public datasets, and introduces their application scenarios. Finally, it discusses the current difficulties and challenges in object detection and outlines directions for future development.



Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant No. 42075007), the Open Project of Provincial Key Laboratory for Computer Information Processing Technology under Grant KJS1935, Soochow University, and the Priority Academic Program Development of Jiangsu Higher Education Institutions.

Author information

Corresponding author

Correspondence to Wei Fang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Fang, W., Shen, L., Chen, Y. (2021). Survey on Image Object Detection Algorithms Based on Deep Learning. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2021. Lecture Notes in Computer Science, vol 12736. Springer, Cham. https://doi.org/10.1007/978-3-030-78609-0_40

  • DOI: https://doi.org/10.1007/978-3-030-78609-0_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78608-3

  • Online ISBN: 978-3-030-78609-0

  • eBook Packages: Computer Science, Computer Science (R0)
