An RGB-D SLAM algorithm based on adaptive semantic segmentation in dynamic environment

  • Research
  • Journal of Real-Time Image Processing

Abstract

When existing visual SLAM (simultaneous localization and mapping) algorithms are applied to dynamic environments, interference from dynamic objects often causes the estimated pose error to increase sharply, or causes the algorithm to fail altogether. Adapting to dynamic scenes requires adding a dynamic-object processing module to the system. However, some existing processing methods reduce real-time performance, which hinders real-time localization and navigation of mobile robots. To address these problems, this paper proposes an RGB-D SLAM system for indoor dynamic environments. The system uses an adaptive semantic segmentation tracking algorithm to meet the requirements of both localization accuracy and real-time performance in dynamic scenes. First, a lightweight semantic segmentation network provides prior information about objects. Based on this prior information and the motion state of each object in the previous scene, each feature point is assigned a motion level and classified as a static point, a movable static point, or a dynamic point. The system then adaptively determines, from the motion-level information of the feature points, whether the current frame needs semantic segmentation. Static points are selected for initial pose estimation, and the pose is then refined in a second optimization step using weighted static constraints. To verify the effectiveness of the proposed algorithm, experiments are carried out on the dynamic-scene sequences of the TUM RGB-D dataset, with comparisons against ORB-SLAM2 and other SLAM algorithms for dynamic environments. The results show that the proposed algorithm performs well on most test sequences, and that positioning accuracy in indoor dynamic environments can be improved by up to 90.57% compared with ORB-SLAM2. In addition, a 3D semantic map of the static background in dynamic scenes is built: dense point cloud maps visualize 3D scene information, and semantic information labels the objects in the scene, which can guide higher-level tasks such as robot navigation and enhance the usability of the system.
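The motion-level idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no thresholds, label sets, or weights, so the names `MOVABLE_LABELS`, `FeaturePoint`, `classify`, `needs_segmentation`, `select_for_initial_pose`, the weight table, and the 0.2 ratio threshold are all assumptions chosen only to make the three point classes and the adaptive decision concrete.

```python
from dataclasses import dataclass
from enum import Enum


class PointClass(Enum):
    STATIC = "static"
    MOVABLE_STATIC = "movable_static"
    DYNAMIC = "dynamic"


# Assumption: labels that a segmentation network might flag as movable objects.
MOVABLE_LABELS = {"person", "chair"}

# Assumption: illustrative weights for a weighted static constraint.
WEIGHTS = {
    PointClass.STATIC: 1.0,
    PointClass.MOVABLE_STATIC: 0.5,
    PointClass.DYNAMIC: 0.0,
}


@dataclass
class FeaturePoint:
    label: str              # semantic prior for the pixel under the feature
    moved_previously: bool  # whether its object was seen moving before


def classify(point: FeaturePoint) -> PointClass:
    """Combine the semantic prior with the earlier motion state."""
    if point.label not in MOVABLE_LABELS:
        return PointClass.STATIC
    if point.moved_previously:
        return PointClass.DYNAMIC
    return PointClass.MOVABLE_STATIC


def needs_segmentation(points, dynamic_ratio_threshold=0.2):
    """Hypothetical rule: segment the frame when too many points are dynamic."""
    if not points:
        return True  # no information yet, so ask for a fresh segmentation
    n_dynamic = sum(classify(p) is PointClass.DYNAMIC for p in points)
    return n_dynamic / len(points) > dynamic_ratio_threshold


def select_for_initial_pose(points):
    """Keep only static points for the first pose estimate."""
    return [p for p in points if classify(p) is PointClass.STATIC]


if __name__ == "__main__":
    frame = [
        FeaturePoint("wall", False),
        FeaturePoint("chair", False),
        FeaturePoint("person", True),
        FeaturePoint("wall", False),
    ]
    for p in frame:
        c = classify(p)
        print(p.label, c.value, WEIGHTS[c])
    print("segment this frame:", needs_segmentation(frame))
```

In this sketch, a second optimization pass would weight each point's residual by `WEIGHTS[classify(p)]`, so movable static points contribute less than static ones and dynamic points are ignored. How the paper actually computes motion levels and weights is not stated in the abstract.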

Data availability

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

References

  1. Mur-Artal, R., Montiel, J.M.M., Tardós, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Trans. Robot. 31(5), 1147–1163 (2015)

  2. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)

  3. Elvira, R., Tardós, J.D., Montiel, J.M.M.: ORBSLAM-Atlas: a robust and accurate multi-map system. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 6253–6259. IEEE, Piscataway, USA (2019)

  4. Qin, T., Li, P.L., Shen, S.J.: VINS-Mono: a robust and versatile monocular visual-inertial state estimator. IEEE Trans. Robot. 34(4), 1004–1020 (2018)

  5. Forster, C., Pizzoli, M., Scaramuzza, D.: SVO: fast semi-direct monocular visual odometry. In: IEEE International Conference on Robotics and Automation, pp. 15–22. IEEE, Piscataway, USA (2014)

  6. Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. IEEE Trans. Pattern Anal. Mach. Intell. 40(3), 611–625 (2018)

  7. Wang, Y.B., Huang, S.D.: Towards dense moving object segmentation based robust dense RGB-D SLAM in dynamic scenarios. In: 13th International Conference on Control, Automation, Robotics & Vision, pp. 1841–1846. IEEE, Piscataway, USA (2014)

  8. Sun, Y.X., Liu, M., Meng, M.Q.H.: Improving RGB-D SLAM in dynamic environments: a motion removal approach. Robot. Auton. Syst. 89, 110–122 (2017)

  9. Lin, Z.L., Zhang, G.L., Yao, E.L., et al.: Stereo visual odometry based on motion object detection in the dynamic scene. Acta Opt. Sin. 37(11), 187–195 (2017)

  10. Chen, L., Fan, L., Xie, G.D., et al.: Moving object detection from consecutive stereo pairs using slanted plane smoothing. IEEE Trans. Intell. Transp. Syst. 18(11), 3093–3102 (2017)

  11. Wang, R.Z., Wan, W.H., Wang, Y.K., et al.: A new RGB-D SLAM method with moving object detection for dynamic indoor scenes. Remote Sens. 11(10), 1143 (2019). https://doi.org/10.3390/rs11101143

  12. Zhang, H.J., Fang, Z.J., Yang, G.L.: RGB-D visual odometry in dynamic environments using line features. Robot 41(1), 75–82 (2019)

  13. Wei, T., Li, X.: Binocular vision SLAM algorithm based on dynamic region elimination in dynamic environment. Robot 42(3), 82–91 (2020)

  14. Bojko, A., Dupont, R., Tamaazousti, M., et al.: Learning to segment dynamic objects using SLAM outliers. In: 25th International Conference on Pattern Recognition, pp. 9780–9787. IEEE, Piscataway, USA (2021)

  15. Bescos, B., Cadena, C., Neira, J.: Empty cities: a dynamic-object invariant space for visual SLAM. IEEE Trans. Robot. 37(2), 433–451 (2021)

  16. Bao, R.Q., Komatsu, R., Miyagusuku, R., et al.: Stereo camera visual SLAM with hierarchical masking and motion state classification at outdoor construction sites containing large dynamic objects. Adv. Robot. 35(4), 228–241 (2021)

  17. Bescos, B., Facil, J.M., Civera, J., et al.: DynaSLAM: tracking, mapping, and inpainting in dynamic scenes. IEEE Robot. Autom. Lett. 3(4), 4076–4083 (2018)

  18. He, K.M., Zhang, X.Y., Ren, S.Q., et al.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. IEEE, Piscataway, USA (2016)

  19. Yu, C., Liu, Z.X., Liu, X.J., et al.: DS-SLAM: a semantic visual SLAM towards dynamic environments. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1168–1174. IEEE, Piscataway, USA (2018)

  20. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)

  21. Xiao, L.H., Wang, J.G., Qiu, X.S., et al.: Dynamic-SLAM: semantic monocular visual localization and mapping based on deep learning in dynamic environment. Robot. Auton. Syst. 117, 1–16 (2019)

  22. Liu, W., Anguelov, D., Erhan, D., et al.: SSD: single shot multibox detector. In: European Conference on Computer Vision, pp. 21–37. Springer, Cham, Switzerland (2016)

  23. Zhong, F.W., Wang, S., Zhang, Z.Q., et al.: Detect-SLAM: making object detection and SLAM mutually beneficial. In: IEEE Winter Conference on Applications of Computer Vision, pp. 1001–1010. IEEE, Piscataway, USA (2018)

  24. Yuan, X., Chen, S.: SaD-SLAM: a visual SLAM based on semantic and depth information. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4930–4935. IEEE, Piscataway, USA (2020)

  25. Zhang, J., Henein, M., Mahony, R., et al.: VDO-SLAM: a visual dynamic object-aware SLAM system. (2020-05-22) [2021-08-01]. https://arxiv.org/abs/2005.11052

  26. Nekrasov, V., Shen, C., Reid, I.: Light-weight RefineNet for real-time semantic segmentation. (2018-10-08) [2021-08-01]. https://arxiv.org/abs/1810.03272

  27. Sturm, J., Engelhard, N., Endres, F., et al.: A benchmark for the evaluation of RGB-D SLAM systems. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 573–580. IEEE, Piscataway, USA (2012)

  28. Ai, Q.L., Liu, G.J., Xu, Q.N.: An RGB-D SLAM algorithm for robot based on the improved geometric and motion constraints in dynamic environment. Robot 43(2), 167–176 (2021)

  29. Chang, Q., et al.: Efficient stereo matching on embedded GPUs with zero-means cross correlation. J. Syst. Archit. 123, 102366 (2022)

  30. Chen, G., et al.: GPU-accelerated real-time stereo estimation with binary neural network. IEEE Trans. Parallel Distrib. Syst. 31(12), 2896–2907 (2020)

Author information

Contributions

LZ wrote the main manuscript text; the research was conducted under the guidance of WS. All authors reviewed the manuscript.

Corresponding author

Correspondence to Song Wei.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wei, S., Li, Z. An RGB-D SLAM algorithm based on adaptive semantic segmentation in dynamic environment. J Real-Time Image Proc 20, 85 (2023). https://doi.org/10.1007/s11554-023-01343-2

  • DOI: https://doi.org/10.1007/s11554-023-01343-2
