Abstract
Simultaneous localization and mapping (SLAM), one of the core prerequisite technologies for intelligent mobile robots, has attracted much attention in recent years. However, traditional SLAM systems rely on a static-environment assumption, which makes them unstable in dynamic environments and limits their real-world applications. To address this problem, this paper presents YOLO-SLAM, a visual SLAM system that is robust in dynamic environments. In YOLO-SLAM, a lightweight object detection network named Darknet19-YOLOv3 is designed, which adopts a low-latency backbone to quickly generate the essential semantic information for the SLAM system. A new geometric constraint method is then proposed to filter dynamic features within the detected regions, where dynamic features are distinguished by applying Random Sample Consensus (RANSAC) to depth differences. YOLO-SLAM couples the object detection approach and the geometric constraint method in a tightly coupled manner, which effectively reduces the impact of dynamic objects. Experiments are conducted on the challenging dynamic sequences of the TUM and Bonn datasets to evaluate the performance of YOLO-SLAM. The results demonstrate that the RMSE of the absolute trajectory error is reduced by 98.13% compared with ORB-SLAM2 and by 51.28% compared with DS-SLAM, indicating that YOLO-SLAM effectively improves stability and accuracy in highly dynamic environments.
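The geometric constraint described above can be illustrated with a minimal one-dimensional sketch: within a detected bounding box, RANSAC finds the depth value with the largest consensus set, and features whose depth deviates from that consensus are flagged as dynamic. The function name, the 0.2 m threshold, and the single-value depth model below are hypothetical simplifications for illustration, not the paper's exact formulation (which compares depth differences of matched features across frames).

```python
import random

def ransac_depth_split(depths, iters=200, thresh=0.2, seed=0):
    """Split feature depths (metres) inside one detected region into a
    consensus (static) set and outliers (likely dynamic) via 1-D RANSAC.

    Hypothetical sketch: names, threshold, and the single-depth model
    are illustrative assumptions, not the paper's exact method.
    """
    rng = random.Random(seed)
    best = []  # indices of the largest consensus set found so far
    for _ in range(iters):
        # Hypothesise a model from one randomly sampled depth value.
        d0 = depths[rng.randrange(len(depths))]
        inliers = [i for i, d in enumerate(depths) if abs(d - d0) <= thresh]
        if len(inliers) > len(best):
            best = inliers
    dynamic = [i for i in range(len(depths)) if i not in set(best)]
    return best, dynamic

# Background features at ~2 m, a person in front of them at ~0.8 m:
static, dynamic = ransac_depth_split([2.0, 2.05, 1.98, 2.1, 0.8, 0.85])
```

On this toy input, the four background depths form the largest consensus set, so the two near-camera features (indices 4 and 5) are rejected as dynamic before pose estimation.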
Acknowledgments
Wenxin Wu and Zhichao You contributed equally to this work as co-first authors of this article. This research was supported in part by the National Natural Science Foundation of China under Grant 51775452 and Grant 51905452, in part by the Fundamental Research Funds for the Central Universities under Grant 2682019CX35 and Grant 2682017ZDPY09, in part by the China Postdoctoral Science Foundation under Grant 2019M663549, in part by the Planning Project of the Science & Technology Department of Sichuan Province under Grant 2019YFG0353, and in part by the Local Development Foundation guided by the Central Government under Grant 2020ZYD012.
Funding
National Natural Science Foundation of China, 51905452, Liang Guo
National Natural Science Foundation of China, 51775452, Hongli Gao
Local Development Foundation guided by the Central Government, 2020ZYD012, Liang Guo
China Postdoctoral Science Foundation, 2019M663549, Liang Guo
Planning Project of Science & Technology Department of Sichuan Province, 2019YFG0353, Hongli Gao
The Fundamental Research Funds for the Central Universities, 2682019CX35, Hongli Gao
The Fundamental Research Funds for the Central Universities, 2682017ZDPY09, Hongli Gao
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. This paper is our original work, and all authors have agreed to its submission to this journal. The paper has not been published and is not under consideration for publication by another journal.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wu, W., Guo, L., Gao, H. et al. YOLO-SLAM: A semantic SLAM system towards dynamic environment with geometric constraint. Neural Comput & Applic 34, 6011–6026 (2022). https://doi.org/10.1007/s00521-021-06764-3