Deep Reinforcement Learning Based Loop Closure Detection

  • Regular paper
Journal of Intelligent & Robotic Systems

Abstract

In this work, we investigate loop closure detection through deep reinforcement learning. Loop closure detection is the problem of correctly identifying a location or area that a robot has previously visited. We propose a reward-driven optimization process that learns to detect loop closures. We designed a grid-based environment that simulates indoor environments and generates observation data for a learning agent, and we use it to train a policy for loop closure detection. We also demonstrate a conversion of a real-world map and its features to the simulated environment. The learning agent was tested in simulation and in indoor lab environments. Our experimental results show that the algorithm can perform loop closure detection effectively.
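
To make the reward-driven framing concrete, the following is a minimal illustrative sketch and not the authors' environment or algorithm. It assumes a hypothetical toy grid (`GridLoopEnv`) in which the robot performs a random walk, the observation is a unique feature id per cell, and the agent receives +1 for correctly labelling a step as a revisit or a new place and -1 otherwise. The class name, the reward values, and the `make_memory_policy` baseline are all assumptions introduced here for illustration; a learned policy would take the place of the hand-written baseline.

```python
import random


class GridLoopEnv:
    """Hypothetical toy grid world: the robot random-walks, the agent labels each step."""

    MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    def __init__(self, size=5, seed=0):
        self.size = size
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.pos = (0, 0)
        self.visited = set()  # ground truth, hidden from the agent
        return self._observe()

    def _observe(self):
        # Assumed observation model: one unique feature id per grid cell.
        return self.pos[0] * self.size + self.pos[1]

    def step(self, action):
        """action: 1 means 'loop closure here', 0 means 'new place'."""
        is_revisit = self.pos in self.visited
        # Assumed reward: +1 for a correct label, -1 for an incorrect one.
        reward = 1.0 if bool(action) == is_revisit else -1.0
        self.visited.add(self.pos)
        # Move the robot one cell, clamped to the grid boundary.
        dx, dy = self.rng.choice(self.MOVES)
        x = min(max(self.pos[0] + dx, 0), self.size - 1)
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos = (x, y)
        return self._observe(), reward


def run_episode(env, policy, steps=50):
    """Roll out a policy (observation -> action) and return the summed reward."""
    obs = env.reset()
    total = 0.0
    for _ in range(steps):
        obs, reward = env.step(policy(obs))
        total += reward
    return total


def make_memory_policy():
    """Hand-written baseline: flag a loop closure when the observation was seen before."""
    seen = set()

    def policy(obs):
        hit = obs in seen
        seen.add(obs)
        return int(hit)

    return policy
```

Calling `run_episode(GridLoopEnv(seed=0), make_memory_policy())` rolls out the baseline. Because observations are unique per cell in this toy setting, the baseline labels every step correctly. With noisy or aliased observations that shortcut fails, which is the kind of setting where a learned policy becomes interesting.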

Acknowledgements

This research has been partially funded by the Advanced Driver Assistance System (ADAS) group at Texas Instruments (TI) in Dallas, TX, and by the University of Texas at Arlington Research Institute.

Funding

This work was supported by the University of Texas at Arlington Research Institute and the Advanced Driver Assistance System (ADAS) group at Texas Instruments (TI) in Dallas, TX.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Asif Iqbal. The lab experiments and data capture were accomplished by Rhitu Thapa and Asif Iqbal. The first draft of the manuscript was written by Asif Iqbal, and both authors edited and revised the manuscript. Management of the study and manuscript preparation were performed by Nicholas Gans.

Corresponding author

Correspondence to Asif Iqbal.

Ethics declarations

Ethics Approval

There are no human or animal subjects in this study. No ethical approval is required.

Consent for Publication

There are no human subjects in this study. No consent to publish is necessary.

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Material and/or Code availability

Data sharing is not applicable to this article as no datasets were generated or analysed during the current study. Data for this research consists of simulation data and video captured by a moving robot. Video data is not publicly available, but will be retained by the authors and be made available upon reasonable request. Software written by the authors will not be publicly shared.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 130 KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Iqbal, A., Thapa, R. & Gans, N.R. Deep Reinforcement Learning Based Loop Closure Detection. J Intell Robot Syst 106, 51 (2022). https://doi.org/10.1007/s10846-022-01720-2

  • DOI: https://doi.org/10.1007/s10846-022-01720-2
