
AGAM-SLAM: An Adaptive Dynamic Scene Semantic SLAM Method Based on GAM

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14090)


Abstract

With the rapid development of autonomous driving, robot navigation, and augmented reality, visual SLAM (VSLAM) has become a core enabling technology. VSLAM systems perform reliably in static scenes, but dynamic objects such as people, vehicles, and animals make consistent map building and localization more difficult and accurate trajectory estimation harder to achieve. In this paper, we propose a semantic VSLAM system based on the global attention mechanism (GAM) and adaptive thresholding. First, GAM improves the segmentation accuracy of the Mask R-CNN model for dynamic objects and eliminates their influence on the VSLAM system. In addition, adaptive thresholding generates an adaptive factor from the number of key points extracted in the scene and dynamically adjusts the FAST threshold, which enables more stable feature-point extraction in dynamic scenes. We verify our approach on the TUM public dataset and compare it with DynaSLAM. Both the absolute trajectory error (ATE) and the relative pose error (RPE) are reduced; on the W_rpy sequence in particular, the accuracy of our method is improved by 38.78%. The experimental results show that the proposed method significantly improves overall system performance in highly dynamic environments.
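The paper's exact adaptive rule is not reproduced on this page, so the following is only a minimal sketch of the idea stated in the abstract: derive an adaptive factor from the number of key points extracted in the current frame and use it to adjust the FAST corner threshold. The function name, the target keypoint count, the gain, and the clamping bounds are illustrative assumptions, not values from the paper.

```python
def adapt_fast_threshold(threshold, num_keypoints, target=1000,
                         gain=0.1, t_min=5, t_max=60):
    """Adjust the FAST threshold toward a target keypoint count.

    If the current frame yields too few keypoints (e.g. after dynamic
    objects are masked out), the threshold is lowered so weaker corners
    pass; if it yields too many, the threshold is raised.
    """
    # Relative deviation from the target acts as the adaptive factor.
    factor = (num_keypoints - target) / target
    new_threshold = threshold * (1.0 + gain * factor)
    # Clamp so the detector stays in a usable operating range.
    return max(t_min, min(t_max, new_threshold))
```

In a real pipeline this update would run once per frame, feeding the returned value back into the FAST detector for the next frame, so the extraction rate settles near the target even as scene texture changes.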

This work was supported in part by the Key Research and Development Project of Hainan Province (ZDYF2022GXJS348, ZDYF2022SHFZ039), the Hainan Province Natural Science Foundation (623RC446), and the National Natural Science Foundation of China (62161010, 61963012). The authors would like to thank the referees for their constructive suggestions.


References

  1. Chang, Y., Tian, Y., How, J.P., Carlone, L.: Kimera-Multi: a system for distributed multi-robot metric-semantic simultaneous localization and mapping. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 11210−11218. IEEE (2021)

  2. Cheng, J., Zhang, L., Chen, Q., Hu, X., Cai, J.: A review of visual SLAM methods for autonomous driving vehicles. Eng. Appl. Artif. Intell. 114, 104992 (2022)

  3. Jinyu, L., Bangbang, Y., Danpeng, C., Nan, W., Guofeng, Z., Hujun, B.: Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality. Virt. Real. Intell. Hardw. 1(4), 386−410 (2019)

  4. Liu, Y., Miura, J.: RDS-SLAM: real-time dynamic SLAM using semantic segmentation methods. IEEE Access 9, 23772−23785 (2021)

  5. Wang, H., Ko, J.Y., Xie, L.: Multi-modal semantic SLAM for complex dynamic environments (2022)

  6. Li, A., Wang, J., Xu, M., Chen, Z.: DP-SLAM: a visual SLAM with moving probability towards dynamic environments. Inf. Sci. 556, 128−142 (2021)

  7. Mur-Artal, R., Montiel, J.M.M., Tardós, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Trans. Robot. 31(5), 1147−1163 (2015)

  8. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255−1262 (2017)

  9. Yu, C., et al.: DS-SLAM: a semantic visual SLAM towards dynamic environments. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1168−1174. IEEE (2018)

  10. Zhao, C., Sun, L., Purkait, P., Duckett, T., Stolkin, R.: Learning monocular visual odometry with dense 3D mapping from dense 3D flow. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6864−6871. IEEE (2018)

  11. Konda, K.R., Memisevic, R.: Learning visual odometry with a convolutional network. In: VISAPP, pp. 486−490 (2015)

  12. Wu, W., Guo, L., Gao, H., You, Z., Liu, Y., Chen, Z.: YOLO-SLAM: a semantic SLAM system towards dynamic environment with geometric constraint. Neural Comput. Appl. 1−16 (2022)

  13. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement (2018)

  14. Bescos, B., Fácil, J.M., Civera, J., Neira, J.: DynaSLAM: tracking, mapping, and inpainting in dynamic scenes. IEEE Robot. Autom. Lett. 3(4), 4076−4083 (2018)

  15. Alismail, H., Kaess, M., Browning, B., Lucey, S.: Direct visual odometry in low light using binary descriptors. IEEE Robot. Autom. Lett. 2(2), 444−451 (2016)

  16. Ono, Y., Trulls, E., Fua, P., Yi, K.M.: LF-Net: learning local features from images. In: Advances in Neural Information Processing Systems, vol. 31 (2018)

  17. Liu, Y., Shao, Z., Hoffmann, N.: Global attention mechanism: retain information to enhance channel-spatial interactions (2021)

  18. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961−2969 (2017)

  19. Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3−19 (2018)

  20. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440−1448 (2015)

  21. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770−778 (2016)

  22. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740−755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48

  23. Sturm, J., Engelhard, N., Endres, F., Burgard, W., Cremers, D.: A benchmark for the evaluation of RGB-D SLAM systems. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 573−580. IEEE (2012)


Acknowledgements

We thank Shenzhen Umouse Technology Development Co., Ltd. for their support with equipment and experimental conditions.

Author information


Corresponding author

Correspondence to Zhuhua Hu.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Cai, D., Hu, Z., Li, R., Qi, H., Xiang, Y., Zhao, Y. (2023). AGAM-SLAM: An Adaptive Dynamic Scene Semantic SLAM Method Based on GAM. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14090. Springer, Singapore. https://doi.org/10.1007/978-981-99-4761-4_3


  • DOI: https://doi.org/10.1007/978-981-99-4761-4_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4760-7

  • Online ISBN: 978-981-99-4761-4

