DOI: 10.1145/3638884.3638928
research-article

SLAM system based on improved DeepLabv3+ semantic network model

Published: 23 April 2024

ABSTRACT

Most visual simultaneous localization and mapping (VSLAM) algorithms are designed for static scenes, yet the influence of moving objects in real scenes cannot be ignored. When moving objects are present, the feature points used by the visual odometry are easily mismatched, which degrades the localization and mapping accuracy of the SLAM system and lowers its robustness in practical applications. This paper proposes a deep-learning-based VSLAM algorithm built on ORB-SLAM2 that identifies and removes dynamic feature points in the tracking stage when the SLAM system operates in dynamic environments. A local optical flow method is incorporated to determine the dynamic objects in the environment. In addition, an improved model based on DeepLabv3+ is introduced, which performs object detection and semantic segmentation simultaneously so that dynamic feature points can be removed accurately. Finally, the accuracy and robustness of the semantic segmentation model and of the improved algorithm are evaluated on the Cityscapes and TUM datasets. The experimental results on the TUM datasets show that, in highly dynamic scenarios, the proposed algorithm improves the accuracy of trajectory estimation by 60.69% and 84.91%, respectively, compared with ORB-SLAM2. The experiments show that the SLAM algorithm based on the improved DeepLabv3+ semantic network model significantly improves the robustness and accuracy of the VSLAM system in dynamic environments.
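The abstract does not give the details of how the segmentation output and the optical flow are combined, so the following is only a minimal sketch of one plausible scheme, not the authors' implementation. It assumes that a segmentation network has already produced a boolean mask of potentially movable classes and that a flow vector is available for each feature point. The function name `filter_dynamic_keypoints`, the parameter `residual_thresh`, and the use of the median background flow as a stand-in for camera motion are all hypothetical choices made for illustration.

```python
import numpy as np


def filter_dynamic_keypoints(keypoints, flow, dynamic_mask, residual_thresh=2.0):
    """Return a boolean array that is True for keypoints kept as static.

    keypoints    : (N, 2) array of (x, y) pixel coordinates
    flow         : (N, 2) array of optical-flow vectors, one per keypoint
    dynamic_mask : (H, W) boolean array, True where the segmentation output
                   marks a potentially movable class (assumed given)
    """
    keypoints = np.asarray(keypoints, dtype=float)
    flow = np.asarray(flow, dtype=float)
    h, w = dynamic_mask.shape

    # Look up the semantic label under each keypoint (clamped to the image).
    xs = np.clip(np.round(keypoints[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(keypoints[:, 1]).astype(int), 0, h - 1)
    in_mask = dynamic_mask[ys, xs]

    # Assumption: the median flow of points outside the mask approximates
    # the flow induced by camera motion alone.
    background = flow[~in_mask] if np.any(~in_mask) else flow
    camera_flow = np.median(background, axis=0)

    # A point is discarded only if it lies on a movable object AND its flow
    # deviates from the camera-induced flow by more than the threshold.
    residual = np.linalg.norm(flow - camera_flow, axis=1)
    is_dynamic = in_mask & (residual > residual_thresh)
    return ~is_dynamic
```

Requiring both cues means that a feature point on a movable but currently stationary object (for example, a seated person) is retained, while a point whose flow disagrees with the background on such an object is dropped before pose estimation.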

    • Published in

      ICCIP '23: Proceedings of the 2023 9th International Conference on Communication and Information Processing
      December 2023
      648 pages
      ISBN: 9798400708909
      DOI: 10.1145/3638884

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 61 of 301 submissions, 20%