Abstract
When existing visual SLAM (simultaneous localization and mapping) algorithms are applied to dynamic environments, the estimated pose error often increases sharply, and the algorithm may even fail outright because of interference from dynamic objects. Adapting to dynamic scenes requires adding a dynamic-object processing module to the system. However, some existing processing methods reduce real-time performance, which hinders real-time localization and navigation of mobile robots. To address these problems, this paper proposes an RGB-D SLAM system for indoor dynamic environments. The system uses an adaptive semantic segmentation tracking algorithm to meet the requirements of both localization accuracy and real-time performance in dynamic scenes. First, a lightweight semantic segmentation network provides prior information about objects. Based on this prior information and the motion state of each object in the previous scene, every feature point is assigned a motion level and classified as a static point, a movable static point, or a dynamic point. Whether the current frame needs semantic segmentation is then determined adaptively from the motion levels of the feature points. Suitable feature points (static points) are selected for initial pose estimation, and the pose is then refined in a second optimization step using weighted static constraints. To verify the effectiveness of the proposed algorithm, experiments are carried out on the TUM RGB-D dynamic scene dataset, and the results are compared with ORB-SLAM2 and other SLAM algorithms for dynamic environments. The results show that the proposed algorithm performs well on most datasets and that positioning accuracy in indoor dynamic environments can be improved by 90.57% compared with ORB-SLAM2.
In addition, a 3D semantic map of the static background in dynamic scenes is built: dense point cloud maps visualize the 3D scene information, and semantic labels identify the objects in the scene. The map can guide higher-level tasks such as robot navigation and improves the usability of the system.
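The motion-level idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the classification rule, the segmentation trigger, or the constraint weights, so the class set `MOVABLE_CLASSES`, the field names, the `dynamic_ratio_threshold` parameter, and the values in `STATIC_WEIGHTS` are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class MotionLevel(Enum):
    STATIC = 0
    MOVABLE_STATIC = 1
    DYNAMIC = 2


# Assumption: semantic classes treated as potentially movable.
MOVABLE_CLASSES = {"person", "chair"}

# Assumption: per-level weights for a weighted static constraint.
STATIC_WEIGHTS = {
    MotionLevel.STATIC: 1.0,
    MotionLevel.MOVABLE_STATIC: 0.5,
    MotionLevel.DYNAMIC: 0.0,
}


@dataclass
class FeaturePoint:
    semantic_label: str          # class given by the segmentation prior
    was_moving_previously: bool  # motion state of its object in the previous scene


def classify(point: FeaturePoint) -> MotionLevel:
    """Assign a motion level from the semantic prior and the previous motion state."""
    if point.semantic_label not in MOVABLE_CLASSES:
        return MotionLevel.STATIC
    if point.was_moving_previously:
        return MotionLevel.DYNAMIC
    return MotionLevel.MOVABLE_STATIC


def needs_segmentation(levels, dynamic_ratio_threshold=0.1):
    """Hypothetical trigger: segment when too many points are dynamic.

    With no motion-level information at all, segmentation is requested.
    """
    if not levels:
        return True
    dynamic = sum(1 for level in levels if level is MotionLevel.DYNAMIC)
    return dynamic / len(levels) > dynamic_ratio_threshold


def select_static(points):
    """Keep only the static points, as used for the initial pose estimate."""
    return [p for p in points if classify(p) is MotionLevel.STATIC]
```

In this sketch, a frame whose feature points are almost all static skips segmentation, which is where the real-time saving would come from. The weights in `STATIC_WEIGHTS` only indicate how movable static points could contribute less than static points during the second optimization step.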
Data availability
The data that support the findings of this study are available from the corresponding author, upon reasonable request.
Author information
Contributions
LZ wrote the main manuscript text; the research was conducted under the guidance of WS. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Wei, S., Li, Z. An RGB-D SLAM algorithm based on adaptive semantic segmentation in dynamic environment. J Real-Time Image Proc 20, 85 (2023). https://doi.org/10.1007/s11554-023-01343-2
DOI: https://doi.org/10.1007/s11554-023-01343-2