RSV-SLAM: Toward Real-Time Semantic Visual SLAM in Indoor Dynamic Environments

  • Conference paper
  • In: Intelligent Systems and Applications (IntelliSys 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 823)

Abstract

Simultaneous Localization and Mapping (SLAM) plays an important role in many robotics applications, including social robots. The majority of existing visual SLAM methods rely on a static-world assumption and fail in dynamic environments. In this paper, we propose a real-time semantic RGB-D SLAM system for dynamic environments that detects moving objects and maintains a static map for robust camera tracking. The proposed model eliminates the influence of dynamic objects by introducing deep learning-based semantic information into the SLAM system. Furthermore, we augment the semantic segmentation process with an extended Kalman filter module to detect dynamic objects that are temporarily idle. We have also implemented a generative network to fill in the regions of input images occluded by dynamic objects. This highly modular framework has been implemented on the ROS platform and achieves around 22 fps on a GTX 1080. Benchmarking the pipeline on dynamic sequences from the TUM dataset shows that the proposed approach achieves localization error competitive with state-of-the-art methods while operating in near real time. The source code is publicly available at https://github.com/mobiiin/rsv_slam.
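The core masking step described in the abstract — removing pixels that a segmentation network labels as dynamic before the tracker sees the frame — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the class ID and the function name are assumptions for the sake of the example.

```python
import numpy as np

# Hypothetical illustration of semantic masking: pixels labeled as a
# dynamic class (e.g. "person") are zeroed out of the RGB and depth
# images, so downstream feature extraction and camera tracking only use
# the static background. The class ID 15 is illustrative.
DYNAMIC_CLASSES = {15}  # e.g. a segmentation network's "person" label

def mask_dynamic_pixels(rgb, depth, seg, dynamic_classes=DYNAMIC_CLASSES):
    """Zero out RGB and depth pixels belonging to dynamic classes.

    rgb:   (H, W, 3) uint8 color image
    depth: (H, W)    float32 depth map in meters
    seg:   (H, W)    int class-label map from a segmentation network
    Returns masked copies plus the boolean dynamic-pixel mask.
    """
    dynamic = np.isin(seg, list(dynamic_classes))
    rgb_out = rgb.copy()
    depth_out = depth.copy()
    rgb_out[dynamic] = 0      # masked color pixels
    depth_out[dynamic] = 0.0  # zero depth is treated as invalid by tracking
    return rgb_out, depth_out, dynamic
```

In a system like the one described, the masked regions could then be handed to the inpainting network, while the valid static pixels feed the RGB-D tracker.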

Mobin Habibpour and Alireza Nemati contributed equally to this work.
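The Kalman-filter module mentioned in the abstract tracks dynamic objects even while they are temporarily idle. As a rough sketch of the idea (not the paper's actual filter, which is an EKF with its own state and parameters), a plain linear Kalman filter with a constant-velocity model over a 2-D object centroid looks like this; all matrices and noise values are illustrative assumptions.

```python
import numpy as np

# Minimal constant-velocity Kalman filter over an object centroid.
# State: [px, py, vx, vy]; only position is observed. A tracked object
# whose predicted state persists while segmentation misses it can still
# be treated as dynamic, which is the role of the paper's EKF module.
class CentroidKF:
    def __init__(self, x0, y0, dt=1.0 / 22.0):  # ~22 fps, as in the abstract
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4)                       # state covariance
        self.F = np.eye(4)                       # constant-velocity transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                    # observe position only
        self.Q = 1e-3 * np.eye(4)                # process noise (illustrative)
        self.R = 1e-2 * np.eye(2)                # measurement noise (illustrative)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]                        # predicted centroid

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

When no detection arrives for a frame, calling `predict()` alone propagates the object's estimated position, so a briefly motionless or occluded object is not mistaken for static background.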

References

  1. Cadena, C., et al.: Past, present, and future of simultaneous localization and mapping: toward the robust-perception age. IEEE Trans. Robot. 32(6), 1309–1332 (2016). https://doi.org/10.1109/TRO.2016.2624754

  2. Engel, J., Schöps, T., Cremers, D.: LSD-SLAM: large-scale direct monocular SLAM. In: ECCV 2014. LNCS, vol. 8690, pp. 834–849. Springer (2014)

  3. Mur-Artal, R., Tardos, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017). https://doi.org/10.1109/TRO.2017.2705103

  4. Gomez-Ojeda, R., Zuñiga-Noël, D., Moreno, F.A., Scaramuzza, D., Gonzalez-Jimenez, J.: PL-SLAM: a stereo SLAM system through the combination of points and line segments. IEEE Trans. Robot. 35(3), 734–746 (2019)

  5. Labbe, M.: RTAB-map as an open-source lidar and visual SLAM library for large-scale and long-term online operation. J. Field Robot. 36(2), 416–446 (2019)

  6. Yu, C., et al.: DS-SLAM: a semantic visual SLAM towards dynamic environments. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2018)

  7. Sturm, J., Engelhard, N., Endres, F., Burgard, W., Cremers, D.: A benchmark for the evaluation of RGB-D SLAM systems. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 573–580 (2012). https://doi.org/10.1109/IROS.2012.6385773

  8. Quigley, M., et al.: ROS: an open-source Robot Operating System. In: ICRA Workshop on Open Source Software (2009)

  9. Liu, Y., Miura, J.: RDS-SLAM: real-time dynamic SLAM using semantic segmentation methods. IEEE Access 9 (2021). https://doi.org/10.1109/ACCESS.2021.3050617

  10. Sun, Y., Liu, M., Meng, M.Q.H.: Improving RGB-D SLAM in dynamic environments: a motion removal approach. Robot. Auton. Syst. 89, 110–122 (2017). https://doi.org/10.1016/j.robot.2016.11.012

  11. Kim, D.H., Kim, J.H.: Effective background model-based RGB-D dense visual odometry in a dynamic environment. IEEE Trans. Robot. 32(6), 1565–1573 (2016). https://doi.org/10.1109/TRO.2016.2609395

  12. Scona, R., Jaimez, M., Petillot, Y.R., Fallon, M., Cremers, D.: StaticFusion: background reconstruction for dense RGB-D SLAM in dynamic environments. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 3849–3856 (2018). https://doi.org/10.1109/ICRA.2018.8460681

  13. Zhang, T., Zhang, H., Li, Y., Nakamura, Y., Zhang, L.: FlowFusion: dynamic dense RGB-D SLAM based on optical flow. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 7322–7328 (2020). https://doi.org/10.1109/ICRA40945.2020.9197349

  14. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017). https://doi.org/10.1109/TPAMI.2016.2644615

  15. Bescos, B., Facil, J.M., Civera, J., Neira, J.: DynaSLAM: tracking, mapping, and inpainting in dynamic scenes. IEEE Robot. Autom. Lett. 3(4), 4076–4083 (2018). https://doi.org/10.1109/LRA.2018.2860039

  16. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 42(2), 386–397 (2020). https://doi.org/10.1109/TPAMI.2018.2844175

  17. Bescos, B., Campos, C., Tardos, J.D., Neira, J.: DynaSLAM II: tightly-coupled multi-object tracking and SLAM. IEEE Robot. Autom. Lett. 6(3), 5191–5198 (2021). https://doi.org/10.1109/LRA.2021.3068640

  18. Xu, B., Li, W., Tzoumanikas, D., Bloesch, M., Davison, A., Leutenegger, S.: MID-fusion: octree-based object-level multi-instance dynamic SLAM. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 5231–5237 (2019). https://doi.org/10.1109/ICRA.2019.8794371

  19. Henein, M., Zhang, J., Mahony, R., Ila, V.: Dynamic SLAM: the need for speed (2020)

  20. Liu, H., Soto, R.A.R., Xiao, F., Lee, Y.J.: YolactEdge: real-time instance segmentation on the edge (2021)

  21. Patil, P.W., Biradar, K.M., Dudhane, A., Murala, S.: An end-to-end edge aggregation network for moving object segmentation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8146–8155 (2020). https://doi.org/10.1109/CVPR42600.2020.00817

  22. Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network (PSPNet). In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)

  23. Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: ECCV 2014. LNCS, vol. 8693, pp. 740–755 (2014). https://doi.org/10.1007/978-3-319-10602-1_48

  24. Vincent, J., Labbe, M., Lauzon, J.S., Grondin, F., Comtois-Rivet, P.M., Michaud, F.: Dynamic object tracking and masking for visual SLAM. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4974–4979 (2020). https://doi.org/10.1109/IROS45743.2020.9340958

  25. Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., Ebrahimi, M.: EdgeConnect: generative image inpainting with adversarial edge learning (2019)

  26. Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.: Free-form image inpainting with gated convolution. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 4470–4479 (2019). https://doi.org/10.1109/ICCV.2019.00457

  27. Zeng, Y., Lin, Z., Lu, H., Patel, V.M.: CR-Fill: generative image inpainting with auxiliary contextual reconstruction. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 14144–14153 (2021). https://doi.org/10.1109/ICCV48922.2021.01390

  28. Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1452–1464 (2018). https://doi.org/10.1109/TPAMI.2017.2723009

Author information

Correspondence to Alireza Nemati or Shima Nazari.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Habibpour, M., Nemati, A., Meghdari, A., Taheri, A., Nazari, S. (2024). RSV-SLAM: Toward Real-Time Semantic Visual SLAM in Indoor Dynamic Environments. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2023. Lecture Notes in Networks and Systems, vol 823. Springer, Cham. https://doi.org/10.1007/978-3-031-47724-9_55
