
YOLO-SLAM: A semantic SLAM system towards dynamic environment with geometric constraint

  • Original Article
  • Neural Computing and Applications

Abstract

Simultaneous localization and mapping (SLAM), one of the core prerequisite technologies for intelligent mobile robots, has attracted much attention in recent years. However, traditional SLAM systems rely on a static-environment assumption, which makes them unstable in dynamic environments and limits their real-world applications. To address this problem, this paper presents YOLO-SLAM, a visual SLAM system that is robust in dynamic environments. In YOLO-SLAM, a lightweight object detection network named Darknet19-YOLOv3 is designed, which adopts a low-latency backbone to quickly generate the essential semantic information for the SLAM system. A new geometric constraint method is then proposed to filter dynamic features within the detected regions, where dynamic features are distinguished by their depth difference using Random Sample Consensus (RANSAC). YOLO-SLAM combines the object detection approach and the geometric constraint method in a tightly coupled manner, which effectively reduces the impact of dynamic objects. Experiments are conducted on the challenging dynamic sequences of the TUM and Bonn datasets to evaluate the performance of YOLO-SLAM. The results demonstrate that the RMSE of the absolute trajectory error is reduced by up to 98.13% compared with ORB-SLAM2 and by 51.28% compared with DS-SLAM, indicating that YOLO-SLAM effectively improves stability and accuracy in highly dynamic environments.
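The abstract's geometric constraint can be pictured as a RANSAC-style consensus over feature depths inside a detected bounding box: features agreeing with the dominant (background) depth are kept, and outliers are flagged as dynamic. The sketch below is a toy illustration of that idea only, not the paper's implementation; the function name, threshold, and iteration count are hypothetical choices.

```python
import random

def ransac_depth_filter(depths, threshold=0.2, iterations=100, seed=0):
    """Toy RANSAC over feature depths inside one detection box.

    Repeatedly sample a candidate depth, collect the features whose
    depth difference to it is below `threshold`, and keep the largest
    consensus set. Features outside that set are flagged as dynamic.
    All parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    best_inliers = set()
    for _ in range(iterations):
        candidate = rng.choice(depths)  # hypothesize a background depth
        inliers = {i for i, d in enumerate(depths)
                   if abs(d - candidate) < threshold}
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # indices outside the consensus set are treated as dynamic features
    dynamic = [i for i in range(len(depths)) if i not in best_inliers]
    return sorted(best_inliers), dynamic

# Four background features near 3 m and two foreground (moving) ones near 1.5 m:
static, dynamic = ransac_depth_filter([3.0, 3.05, 2.95, 3.1, 1.5, 1.45])
```

In this example the consensus set is the cluster around 3 m, so the two shallower features (a moving person in front of the background, say) would be discarded before pose estimation.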


[Figures 1–11 are available in the full article.]


References

  1. Scona R, Nobili S, Petillot YR, Fallon M (2017) Direct visual SLAM fusing proprioception for a humanoid robot. In: IEEE International Conference on Intelligent Robots and Systems

  2. Aladem M, Rawashdeh SA (2018) Lightweight visual odometry for autonomous mobile robots. Sensors (Switzerland). https://doi.org/10.3390/s18092837


  3. Giubilato R, Chiodini S, Pertile M, Debei S (2019) An evaluation of ROS-compatible stereo visual SLAM methods on a nVidia Jetson TX2. Meas J Int Meas Confed. https://doi.org/10.1016/j.measurement.2019.03.038


  4. Cadena C, Carlone L, Carrillo H et al (2016) Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Trans Robot. https://doi.org/10.1109/TRO.2016.2624754


  5. Mur-Artal R, Montiel JMM, Tardós JD (2015) ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Trans Robot 31:1147–1163. https://doi.org/10.1109/TRO.2015.2463671


  6. Pumarola A, Vakhitov A, Agudo A et al (2017) PL-SLAM: Real-time monocular visual SLAM with points and lines. Proc IEEE Int Conf Robot Autom. https://doi.org/10.1109/ICRA.2017.7989522


  7. Fuentes-Pacheco J, Ruiz-Ascencio J, Rendón-Mancha JM (2012) Visual simultaneous localization and mapping: a survey. Artif Intell Rev 43:55–81. https://doi.org/10.1007/s10462-012-9365-8


  8. Li P, Qin T, Shen S (2018) Stereo Vision-Based Semantic 3D Object and Ego-Motion Tracking for Autonomous Driving. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 11206 LNCS:664–679. https://doi.org/10.1007/978-3-030-01216-8_40

  9. Siddiqui MK, Islam MZ, Kabir MA (2019) A novel quick seizure detection and localization through brain data mining on ECoG dataset. Neural Comput Appl. https://doi.org/10.1007/s00521-018-3381-9


  10. Mur-Artal R, Tardos JD (2017) ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras. IEEE Trans Robot 33:1255–1262. https://doi.org/10.1109/TRO.2017.2705103


  11. Engel J, Schöps T, Cremers D (2014) LSD-SLAM: Large-Scale Direct monocular SLAM. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 8690 LNCS:834–849. https://doi.org/10.1007/978-3-319-10605-2_54

  12. Endres F, Hess J, Sturm J et al (2014) 3-D Mapping with an RGB-D camera. IEEE Trans Robot 30:177–187. https://doi.org/10.1109/TRO.2013.2279412


  13. Saputra MRU, Markham A, Trigoni N (2018) Visual SLAM and structure from motion in dynamic environments: A survey. ACM Comput Surv. https://doi.org/10.1145/3177853


  14. Wang CC, Thorpe C (2002) Simultaneous localization and mapping with detection and tracking of moving objects. Proc - IEEE Int Conf Robot Autom 3:2918–2924. https://doi.org/10.1109/robot.2002.1013675


  15. Kim DH, Kim JH (2016) Effective background model-based RGB-D dense visual odometry in a dynamic environment. IEEE Trans Robot 32:1565–1573. https://doi.org/10.1109/TRO.2016.2609395


  16. Li S, Lee D (2017) RGB-D SLAM in Dynamic Environments Using Static Point Weighting. IEEE Robot Autom Lett 2:2263–2270. https://doi.org/10.1109/LRA.2017.2724759


  17. Wang R, Wan W, Wang Y, Di K (2019) A new RGB-D SLAM method with moving object detection for dynamic indoor scenes. Remote Sens. https://doi.org/10.3390/rs11101143


  18. Cheng J, Wang C, Meng MQH (2020) Robust Visual Localization in Dynamic Environments Based on Sparse Motion Removal. IEEE Trans Autom Sci Eng 17:658–669. https://doi.org/10.1109/TASE.2019.2940543


  19. Liu Y, Liu Y, Gao H et al (2020) A Data-Flow Oriented Deep Ensemble Learning Method for Real-Time Surface Defect Inspection. IEEE Trans Instrum Meas. https://doi.org/10.1109/TIM.2019.2957849


  20. Li F, Li W, Chen W et al (2020) A Mobile Robot Visual SLAM System with Enhanced Semantics Segmentation. IEEE Access 8:25442–25458. https://doi.org/10.1109/ACCESS.2020.2970238


  21. Zhang L, Wei L, Shen P et al (2018) Semantic SLAM based on object detection and improved octomap. IEEE Access 6:75545–75559. https://doi.org/10.1109/ACCESS.2018.2873617


  22. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 779–788. https://doi.org/10.1109/CVPR.2016.91

  23. Redmon J, Farhadi A (2017) YOLO9000: Better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 6517–6525. https://doi.org/10.1109/CVPR.2017.690

  24. Redmon J, Farhadi A (2018) YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767

  25. Liu Y, Miura J (2021) RDS-SLAM: Real-Time Dynamic SLAM Using Semantic Segmentation Methods. IEEE Access. https://doi.org/10.1109/ACCESS.2021.3050617


  26. Guo L, Lei Y, Li N et al (2018) Neurocomputing Machinery health indicator construction based on convolutional neural networks considering trend burr. Neurocomputing 292:142–150. https://doi.org/10.1016/j.neucom.2018.02.083


  27. Tan Y, Guo L, Gao H, Zhang L (2021) Network: A Method for Intelligent Fault Diagnosis Between Artificial and Real Damages. 70:

  28. Yu C, Liu Z, Liu XJ et al (2018) DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments. IEEE Int Conf Intell Robot Syst. https://doi.org/10.1109/IROS.2018.8593691


  29. Bescos B, Facil JM, Civera J, Neira J (2018) DynaSLAM: Tracking, Mapping, and Inpainting in Dynamic Scenes. IEEE Robot Autom Lett 3:4076–4083. https://doi.org/10.1109/LRA.2018.2860039


  30. Zhao L, Liu Z, Chen J et al (2019) A Compatible Framework for RGB-D SLAM in Dynamic Scenes. IEEE Access 7:75604–75614. https://doi.org/10.1109/ACCESS.2019.2922733


  31. Yang D, Bi S, Wang W et al (2019) DRE-SLAM: Dynamic RGB-D Encoder SLAM for a Differential-Drive Robot. Remote Sens. https://doi.org/10.3390/rs11040380


  32. Li P, Zhao W (2020) Image fire detection algorithms based on convolutional neural networks. Case Stud Therm Eng. https://doi.org/10.1016/j.csite.2020.100625


  33. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings

  34. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. pp 770–778

  35. Zhao L, Liu Z, Chen J et al (2019) A Compatible Framework for RGB-D SLAM in Dynamic Scenes. IEEE Access. https://doi.org/10.1109/ACCESS.2019.2922733


  36. Liu G, Zeng W, Feng B, Xu F (2019) DMS-SLAM: A general visual SLAM system for dynamic scenes with multiple sensors. Sensors (Switzerland). https://doi.org/10.3390/s19173714


  37. Xiao L, Wang J, Qiu X et al (2019) Dynamic-SLAM: Semantic monocular visual localization and mapping based on deep learning in dynamic environment. Rob Auton Syst 117:1–16. https://doi.org/10.1016/j.robot.2019.03.012


  38. Sturm J, Engelhard N, Endres F et al (2012) A benchmark for the evaluation of RGB-D SLAM systems. IEEE Int Conf Intell Robot Syst. https://doi.org/10.1109/IROS.2012.6385773


  39. Palazzolo E, Behley J, Lottes P et al (2019) ReFusion: 3D Reconstruction in Dynamic Environments for RGB-D Cameras Exploiting Residuals. IEEE Int Conf Intell Robot Syst. https://doi.org/10.1109/IROS40897.2019.8967590


  40. Everingham M, Van Gool L, Williams CKI et al (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vis 88:303–338. https://doi.org/10.1007/s11263-009-0275-4


  41. Raguram R, Chum O, Pollefeys M et al (2013) USAC: A universal framework for random sample consensus. IEEE Trans Pattern Anal Mach Intell 35:2022–2038. https://doi.org/10.1109/TPAMI.2012.257


  42. Scona R, Jaimez M, Petillot YR, et al (2018) StaticFusion: Background Reconstruction for Dense RGB-D SLAM in Dynamic Environments. In: Proceedings - IEEE International Conference on Robotics and Automation


Acknowledgments

Wenxin Wu and Zhichao You contributed equally to this work as co-first authors of this article. This research was supported in part by the National Natural Science Foundation of China under Grants 51775452 and 51905452, in part by the Fundamental Research Funds for the Central Universities under Grants 2682019CX35 and 2682017ZDPY09, in part by the China Postdoctoral Science Foundation under Grant 2019M663549, in part by the Planning Project of the Science & Technology Department of Sichuan Province under Grant 2019YFG0353, and in part by the Local Development Foundation guided by the Central Government under Grant 2020ZYD012.

Funding

National Natural Science Foundation of China, 51905452, Liang Guo; National Natural Science Foundation of China, 51775452, Hongli Gao; Local Development Foundation guided by the Central Government, 2020ZYD012, Liang Guo; China Postdoctoral Science Foundation, 2019M663549, Liang Guo; Planning Project of the Science & Technology Department of Sichuan Province, 2019YFG0353, Hongli Gao; Fundamental Research Funds for the Central Universities, 2682019CX35 and 2682017ZDPY09, Hongli Gao

Author information


Corresponding author

Correspondence to Liang Guo.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. The contribution of this paper is our original work, and all authors have agreed to the submission of this work to this journal. The paper has not been published and is not under consideration for publication by another journal.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wu, W., Guo, L., Gao, H. et al. YOLO-SLAM: A semantic SLAM system towards dynamic environment with geometric constraint. Neural Comput & Applic 34, 6011–6026 (2022). https://doi.org/10.1007/s00521-021-06764-3

