Survey on Image Object Detection Algorithms Based on Deep Learning

Fang, Wei; Shen, Liang; Chen, Yupeng

doi:10.1007/978-3-030-78609-0_40

Conference paper
First Online: 09 July 2021

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12736)

Abstract

With the development of image processing technology, computer vision has become increasingly popular. In recent years, as deep learning has flourished, significant progress has been made in object detection. Since the R-CNN framework was proposed, deep-learning-based object detection frameworks have gradually become mainstream; they can be divided into two categories: region-based and regression-based. Taking these two types of frameworks as its main subject, this paper summarizes the research background and then discusses object detection algorithms based on candidate regions, represented by Faster R-CNN, and algorithms based on regression, represented by the YOLO series. Following their development history, this paper summarizes the frameworks proposed in recent years, compares and analyzes the performance of object detection algorithms on public datasets, and introduces their application scenarios. Finally, this paper discusses the current difficulties and challenges in object detection and looks ahead to future research directions.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant No. 42075007), the Open Project of the Provincial Key Laboratory for Computer Information Processing Technology, Soochow University (Grant KJS1935), and the Priority Academic Program Development of Jiangsu Higher Education Institutions.

Author information

Authors and Affiliations

School of Computer and Software, Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, China
Wei Fang, Liang Shen & Yupeng Chen
Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou, China
Wei Fang

Corresponding author

Correspondence to Wei Fang.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Xingming Sun
Nanjing University of Information Science and Technology, Nanjing, China
Xiaorui Zhang
Jinan University, Guangzhou, China
Zhihua Xia
Purdue University, West Lafayette, IN, USA
Elisa Bertino

About this paper

Cite this paper

Fang, W., Shen, L., Chen, Y. (2021). Survey on Image Object Detection Algorithms Based on Deep Learning. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2021. Lecture Notes in Computer Science, vol 12736. Springer, Cham. https://doi.org/10.1007/978-3-030-78609-0_40

DOI: https://doi.org/10.1007/978-3-030-78609-0_40
Published: 09 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78608-3
Online ISBN: 978-3-030-78609-0
eBook Packages: Computer Science, Computer Science (R0)
