ABSTRACT
Visual SLAM is easily disturbed by movable objects in dynamic scenes: key points that fall on such objects are unreliable, which reduces localization accuracy and robustness. To address this problem, this paper proposes a visual SLAM algorithm for dynamic scenes based on target detection in RGB-D images. The algorithm first identifies movable objects in the scene using the YOLOv5 target detector, whose results are transmitted to the SLAM framework through socket communication. A threshold operation on the depth map then generates a mask of the movable objects, and the data with movable objects removed are input into the ORB-SLAM2 system. Experimental results show that the proposed algorithm successfully handles dynamic scenes, achieving a better balance between processing speed and the localization accuracy of the reconstructed map than some other SLAM systems for dynamic scenes.
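The masking step can be illustrated with a minimal sketch. The abstract does not specify how the depth threshold is chosen, so everything below is an assumption made for illustration: the reference depth is taken as the median of the valid depths inside each detection box, pixels within a hypothetical tolerance `tol` of that depth are marked as movable, and the function names `movable_mask` and `keep_static` are invented here. It is not the authors' implementation.

```python
import numpy as np

def movable_mask(depth, boxes, tol=0.3):
    """Return a boolean mask that is True on pixels assumed to lie on movable objects.

    depth: (H, W) float array in meters; 0 marks missing depth.
    boxes: iterable of (x1, y1, x2, y2) pixel boxes from a detector (x2, y2 exclusive).
    tol:   hypothetical tolerance in meters around each box's median depth.
    """
    mask = np.zeros(depth.shape, dtype=bool)
    for x1, y1, x2, y2 in boxes:
        roi = depth[y1:y2, x1:x2]
        valid = roi > 0
        if not valid.any():
            continue  # no usable depth inside this box
        ref = np.median(roi[valid])  # assumption: the object dominates the box
        mask[y1:y2, x1:x2] |= valid & (np.abs(roi - ref) <= tol)
    return mask

def keep_static(keypoints, mask):
    """Drop (x, y) keypoints that land on masked pixels."""
    return [(x, y) for x, y in keypoints if not mask[y, x]]

# Toy scene: a wall at 4 m and a person at 1.5 m filling columns 2..4.
depth = np.full((6, 8), 4.0)
depth[0:6, 2:5] = 1.5
boxes = [(1, 0, 6, 6)]  # the detection box is looser than the object itself
mask = movable_mask(depth, boxes)
static = keep_static([(0, 0), (3, 2), (5, 3), (7, 5)], mask)
```

In this toy scene the keypoint at (5, 3) lies inside the detection box but on the background wall, so it is kept. This is the benefit of thresholding depth rather than discarding every feature inside the box.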
Index Terms
- Dynamic Scene Vision SLAM Based on Target Detection in RGB-D Images