A Semi-Direct Monocular Visual SLAM Algorithm in Complex Environments

Journal of Intelligent & Robotic Systems

Abstract

A novel monocular visual simultaneous localization and mapping (SLAM) algorithm built on the semi-direct method is proposed to address problems that arise in complex environments, such as low texture, moving objects, and perceptual aliasing. The proposed algorithm combines the advantages of direct and feature-based methods. On the one hand, a direct method tracks the camera pose and performs feature alignment. On the other hand, ORB features are extracted from keyframes and matched for optimization and loop closure. To improve localization accuracy in dynamic environments, a motion detection module that is robust to illumination changes is adopted. In addition, to resolve the loop closure detection problem in scenes with perceptual aliasing, this paper fuses the spatial information between pairs of visual words into the bag of visual words (BoVW) model and employs an improved pyramid term frequency-inverse document frequency (TF-IDF) scoring match scheme. Experimental results show that the proposed algorithm outperforms ORB-SLAM in overall accuracy and speed in complex environments.
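To make the BoVW scoring idea concrete, the following is a minimal sketch of plain TF-IDF weighting and L1 similarity between two images represented as lists of visual-word ids. It is not the paper's improved pyramid scheme and does not include the spatial information between visual words; the function names (`idf_weights`, `tfidf_vector`, `l1_score`) and the toy word ids are assumptions made for illustration only.

```python
import math
from collections import Counter


def idf_weights(database, vocab_size):
    """Compute IDF per visual word from a list of images (lists of word ids)."""
    n_images = len(database)
    doc_freq = Counter()
    for words in database:
        doc_freq.update(set(words))  # count each word once per image
    # Words never observed in the database get weight 0.
    return [
        math.log(n_images / doc_freq[w]) if doc_freq[w] else 0.0
        for w in range(vocab_size)
    ]


def tfidf_vector(words, idf):
    """Build a sparse, L1-normalized TF-IDF vector for one image."""
    counts = Counter(words)
    total = len(words)
    vec = {w: (c / total) * idf[w] for w, c in counts.items() if idf[w] > 0.0}
    norm = sum(vec.values())
    return {w: v / norm for w, v in vec.items()} if norm else {}


def l1_score(v1, v2):
    """Similarity 1 - 0.5 * ||v1 - v2||_1 for L1-normalized vectors, in [0, 1]."""
    if not v1 or not v2:
        return 0.0  # assumption: an image with no informative words matches nothing
    keys = set(v1) | set(v2)
    dist = sum(abs(v1.get(k, 0.0) - v2.get(k, 0.0)) for k in keys)
    return 1.0 - 0.5 * dist


# Toy database: three images over a hypothetical 5-word vocabulary.
database = [[0, 1, 1, 2], [0, 3, 3, 4], [0, 1, 2, 2]]
idf = idf_weights(database, vocab_size=5)
vectors = [tfidf_vector(words, idf) for words in database]
```

In this toy database, word 0 appears in every image, so its IDF is zero and it contributes nothing to the score. This is the mechanism by which TF-IDF discounts uninformative words; scenes with perceptual aliasing can still share many informative words, which is the case the abstract's spatial extension targets.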

Acknowledgments

This work was supported by the National Natural Science Foundation (Grant No. 61573100) and the NJUPT Program (Grant No. NY219123). We sincerely thank the editors and reviewers for their constructive comments.

Author information

Corresponding author

Correspondence to Chengzhi Wang.

About this article

Cite this article

Liang, Z., Wang, C. A Semi-Direct Monocular Visual SLAM Algorithm in Complex Environments. J Intell Robot Syst 101, 25 (2021). https://doi.org/10.1007/s10846-020-01297-8

  • DOI: https://doi.org/10.1007/s10846-020-01297-8
