Abstract
Simultaneous Localization and Mapping (SLAM) plays an important role in many robotics fields, including social robotics. The majority of existing visual SLAM methods rely on a static-world assumption and fail in dynamic environments. In this paper, we propose a real-time semantic RGB-D SLAM system for dynamic environments that detects moving objects and maintains a static map for robust camera tracking. The proposed model eliminates the influence of dynamic objects by introducing deep-learning-based semantic information into the SLAM system. Furthermore, we augment the semantic segmentation process with an Extended Kalman Filter module that detects dynamic objects which are temporarily idle. We have also implemented a generative network to fill in the regions of the input images occluded by dynamic objects. This highly modular framework has been implemented on the ROS platform and achieves around 22 fps on a GTX 1080. Benchmarking the developed pipeline on dynamic sequences from the TUM dataset shows that the proposed approach achieves localization error competitive with state-of-the-art methods while operating in near real-time. The source code is publicly available at https://github.com/mobiiin/rsv_slam.
Mobin Habibpour and Alireza Nemati contributed equally to this work.
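The abstract mentions augmenting semantic segmentation with an Extended Kalman Filter to catch dynamic objects that are temporarily idle. The paper's actual filter design is not given here, so the following is only an illustrative sketch under simple assumptions: a 1-D constant-velocity Kalman filter on the x-coordinate of a segmented object's centroid, whose velocity estimate persists through brief pauses, letting the object stay masked out of camera tracking even while it is momentarily stationary. The class name `CentroidKF` and all thresholds are hypothetical.

```python
# Illustrative sketch only (not the paper's implementation): track a
# segmented object's centroid with a 1-D constant-velocity Kalman
# filter. A previously moving object keeps a non-zero velocity
# estimate for a while after it stops, so it can still be flagged
# as dynamic and masked out of the static map.

class CentroidKF:
    def __init__(self, x0, dt=1.0, q=0.01, r=1.0):
        self.x = [x0, 0.0]                  # state: [position, velocity]
        self.p = [[1.0, 0.0], [0.0, 1.0]]   # state covariance P
        self.dt, self.q, self.r = dt, q, r  # time step, process/measurement noise

    def predict(self):
        dt = self.dt
        # x <- F x, with F = [[1, dt], [0, 1]] (constant-velocity model)
        self.x = [self.x[0] + dt * self.x[1], self.x[1]]
        p = self.p
        # P <- F P F^T + Q (Q = q * I here, for simplicity)
        p00 = p[0][0] + dt * (p[0][1] + p[1][0]) + dt * dt * p[1][1] + self.q
        p01 = p[0][1] + dt * p[1][1]
        p10 = p[1][0] + dt * p[1][1]
        p11 = p[1][1] + self.q
        self.p = [[p00, p01], [p10, p11]]

    def update(self, z):
        # position-only measurement: H = [1, 0]
        innovation = z - self.x[0]
        s = self.p[0][0] + self.r               # innovation covariance
        k0, k1 = self.p[0][0] / s, self.p[1][0] / s  # Kalman gain
        self.x = [self.x[0] + k0 * innovation, self.x[1] + k1 * innovation]
        p = self.p
        self.p = [[(1 - k0) * p[0][0], (1 - k0) * p[0][1]],
                  [p[1][0] - k1 * p[0][0], p[1][1] - k1 * p[0][1]]]
        return innovation

    def is_moving(self, speed_thresh=0.5):
        # flag the object as dynamic while the estimated speed is high,
        # even if the current segmentation frame shows it idle
        return abs(self.x[1]) > speed_thresh
```

A real system would run one such filter per object instance in full image coordinates (or 3-D, using depth) and combine its verdict with the per-frame segmentation mask before feeding features to the tracker.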
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Habibpour, M., Nemati, A., Meghdari, A., Taheri, A., Nazari, S. (2024). RSV-SLAM: Toward Real-Time Semantic Visual SLAM in Indoor Dynamic Environments. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2023. Lecture Notes in Networks and Systems, vol 823. Springer, Cham. https://doi.org/10.1007/978-3-031-47724-9_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47723-2
Online ISBN: 978-3-031-47724-9
eBook Packages: Intelligent Technologies and Robotics (R0)