Progress in multi-object detection models: a comprehensive survey

Balakrishna, Sivadi; Mustapha, Ahmad Abubakar

doi:10.1007/s11042-022-14131-0

Published: 10 November 2022

Volume 82, pages 22405–22439, (2023)
Multimedia Tools and Applications

Sivadi Balakrishna¹ &
Ahmad Abubakar Mustapha¹

Abstract

Deep learning-based object detection has become popular, and in recent years a research hotspot, owing to its strong learning ability and its advantages in handling occlusion, scale transformation, and context changes. This paper surveys current deep learning models for multi-object detection across various applications, covering generic and salient detection models from one-stage to two-stage designs. We also examine the advantages and some drawbacks of these models. Furthermore, we discuss challenges such as variation in object scale, computation time, and illumination that differs across applications, along with promising research directions for deep learning models. Finally, we propose Dense PRediction Simplified (DPRS), which is based on the YOLO model. Because backbones play a vital role in the performance of detection models, an efficient backbone architecture will be fused to achieve competitive state-of-the-art results.

Availability of data and material (data transparency)

The data and code generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

The authors are thankful to all reviewers for their thorough reading of this manuscript and for their thoughtful comments and constructive suggestions, which helped us improve the quality of the manuscript.

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Vignan’s Foundation for Science, Technology & Research (Deemed to be University), Vadlamudi, Guntur, AP, India
Sivadi Balakrishna & Ahmad Abubakar Mustapha

Contributions

The authors confirm contribution to the paper as follows:

Sivadi Balakrishna: study conceptualization and design, methodology, investigation, writing (review and editing), supervision.

Ahmad Abubakar Mustapha: data collection, software, validation, formal analysis, resources, writing (original draft preparation).

Corresponding author

Correspondence to Sivadi Balakrishna.

Ethics declarations

Conflicts of interest

The authors of the manuscript declare that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Balakrishna, S., Mustapha, A.A. Progress in multi-object detection models: a comprehensive survey. Multimed Tools Appl 82, 22405–22439 (2023). https://doi.org/10.1007/s11042-022-14131-0

Received: 18 April 2021
Revised: 01 October 2022
Accepted: 25 October 2022
Published: 10 November 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s11042-022-14131-0
