Abstract
In computer vision, building an effective object detection model is a difficult task. Object detection is the process of identifying the objects in a given image and locating each of their instances. It is used in a wide range of applications, including self-driving cars, navigation assistance for visually impaired people in indoor and outdoor environments, crowd counting, vehicle detection, and object tracking. Traditionally, OpenCV's feature extraction and feature detection algorithms have been employed for object detection. However, the performance of those algorithms in real-world conditions is unsatisfactory. More recently, deep learning models have become available that extract and learn object features in order to detect objects, and their performance is acceptable in real environments. This paper compares SSD, YOLOv3, and YOLOv4 for detecting objects in an indoor environment and discusses sparse and dense prediction across the deep learning algorithms used for object detection.
Ethics declarations
The competent authorities allowed us to use the images/data as provided in the study. We shall be entirely responsible for any dispute in the future.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Naveenkumar, A., Akilandeswari, J. (2022). Deep Learning Algorithms for Object Detection—A Study. In: Bhateja, V., Tang, J., Satapathy, S.C., Peer, P., Das, R. (eds) Evolution in Computational Intelligence. Smart Innovation, Systems and Technologies, vol 267. Springer, Singapore. https://doi.org/10.1007/978-981-16-6616-2_7
DOI: https://doi.org/10.1007/978-981-16-6616-2_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6615-5
Online ISBN: 978-981-16-6616-2
eBook Packages: Intelligent Technologies and Robotics (R0)