Abstract
Indoor mobile robots, especially those that assist elderly and disabled people, are becoming increasingly important for improving their users' quality of life. Interest in this field stems largely from the robots' ability to help people grasp or carry objects. Accurate detection and localization of targets in indoor environments is a prerequisite for this task. To this end, this paper proposes a novel indoor target detection and localization method based on an improved YOLOv5 for indoor mobile robots equipped with a KinectV2 camera. First, we built an indoor scene dataset containing 2000 RGB images and 2000 depth images to enhance the robustness of the 2D detection model under image blur, strong and weak illumination, and target occlusion. Second, we propose an improved YOLOv5-S network for indoor 2D target detection and verify its effectiveness both theoretically and experimentally. On our dataset, the improved YOLOv5-S detector achieves an mAP@0.5 of 95.9% at 65.36 FPS. Third, we propose an improved mean filtering method that processes the depth value of the target center point to suppress noise in the depth image. Finally, we derive the transformation of the target center point from the 2D pixel coordinate system to the 3D camera coordinate system and calibrate our KinectV2 camera with the chessboard calibration method, thereby achieving 3D localization of the target center point. In localization experiments over the range of 0.5 m to 3 m, the MAE of the localization results is only 11.59 mm, which demonstrates the effectiveness of the proposed method.
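The depth-filtering and 2D-to-3D steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the improved mean filter, so the sketch assumes a plain windowed mean that skips invalid (zero) depth readings, and it assumes the standard pinhole camera model for the back-projection. The function names, the window size `half`, and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical; in practice the intrinsics would come from the chessboard calibration. The MAE helper averages absolute errors over all coordinates, which is also an assumption about how the metric is computed.

```python
import numpy as np


def window_mean_depth(depth, u, v, half=2):
    """Mean of the valid (non-zero) depth values in a window centred on
    pixel (u, v), where u is the column and v is the row.

    Assumption: a zero marks a missing depth reading and is skipped.
    Returns None when the window holds no valid reading.
    """
    h, w = depth.shape
    r0, r1 = max(v - half, 0), min(v + half + 1, h)
    c0, c1 = max(u - half, 0), min(u + half + 1, w)
    patch = depth[r0:r1, c0:c1].astype(np.float64)
    valid = patch[patch > 0]
    if valid.size == 0:
        return None
    return float(valid.mean())


def pixel_to_camera(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth z into the camera frame
    using the pinhole model (lens distortion is ignored here)."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z], dtype=np.float64)


def mean_absolute_error(pred, truth):
    """Mean absolute error over all coordinates of all points."""
    pred = np.asarray(pred, dtype=np.float64)
    truth = np.asarray(truth, dtype=np.float64)
    return float(np.mean(np.abs(pred - truth)))


if __name__ == "__main__":
    # Synthetic 5x5 depth patch in millimetres with one missing reading.
    depth = np.full((5, 5), 1000.0)
    depth[2, 2] = 0.0
    z = window_mean_depth(depth, u=2, v=2, half=1)
    # Hypothetical intrinsics, for illustration only.
    point = pixel_to_camera(u=2, v=2, z=z, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
    print(point)
```

Skipping zero readings keeps a missing measurement at the target center from dragging the average toward zero, which is one simple way to make a mean filter tolerant of holes in the depth image.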
Data availability
The processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.
Ethics declarations
Conflict of interest
None of the authors of this paper has a financial or personal relationship with other people or organizations that could inappropriately influence or bias the content of the paper. The authors specifically state that “No competing interests are at stake and there is no conflict of interest.”
About this article
Cite this article
Qian, W., Hu, C., Wang, H. et al. A novel target detection and localization method in indoor environment for mobile robot based on improved YOLOv5. Multimed Tools Appl 82, 28643–28668 (2023). https://doi.org/10.1007/s11042-023-14569-w
DOI: https://doi.org/10.1007/s11042-023-14569-w