
A review on monocular tracking and mapping: from model-based to data-driven methods

  • Survey
  • Published in: The Visual Computer

Abstract

Visual odometry and visual simultaneous localization and mapping (SLAM) track the position of a camera and map its surroundings from images, and both are important components of robotic perception. Tracking and mapping with a monocular camera is cost-effective, requires little calibration effort, and is easy to deploy across a wide range of applications. This paper provides an extensive review of developments in monocular tracking and mapping over the first two decades of the twenty-first century. Striking results from the early filtering-based methods inspired the community to extend these algorithms with other techniques such as bundle adjustment and deep learning. The article first introduces the basic sensor systems and analyzes the evolution of monocular tracking and mapping algorithms through bibliometric data. It then gives an overview of filtering and bundle adjustment methods, followed by recent advances in deep learning-based methods and the mathematical constraints applied to the networks. Finally, the popular benchmarks available for developing and evaluating these algorithms are presented, along with a comparative study across the different classes of algorithms. It is anticipated that this article will serve as an up-to-date introduction and further stimulate the community's interest in solving current and future impediments.
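As background for the model-based pipeline the review covers, the following minimal sketch (an illustrative assumption, not the paper's implementation) shows the classical frame-to-frame monocular tracking step: detect and match features between two frames, estimate the essential matrix with RANSAC, and recover the relative camera pose with OpenCV. The image paths and camera intrinsics are placeholders, and the recovered translation is defined only up to scale, which is the well-known limitation of monocular setups.

# Minimal two-frame monocular visual odometry sketch (illustrative only).
# Estimates the relative rotation R and unit-scale translation t between two
# frames via ORB matching, essential-matrix estimation, and pose recovery.
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """Return (R, t) of frame 2 relative to frame 1; t is recovered only up to scale."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Brute-force Hamming matching with cross-check gives putative correspondences.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # RANSAC rejects outlier matches while fitting the essential matrix.
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    # The cheirality check selects the valid (R, t) decomposition of E.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t

if __name__ == "__main__":
    # Placeholder intrinsics (KITTI-like values) and frame paths; replace with your own.
    K = np.array([[718.856, 0.0, 607.193],
                  [0.0, 718.856, 185.216],
                  [0.0, 0.0, 1.0]])
    f1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
    f2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
    R, t = relative_pose(f1, f2, K)
    print("R =\n", R, "\nt (up to scale) =", t.ravel())

A full system would chain such relative estimates over time, triangulate landmarks, and refine both with filtering or bundle adjustment, which is the trajectory the review traces from model-based to data-driven methods.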





Acknowledgements

The authors are grateful to the sponsors who provided the YUTP Grant (015LC0-243) for this project.

Author information


Corresponding author

Correspondence to Irraivan Elamvazuthi.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gadipudi, N., Elamvazuthi, I., Izhar, L.I. et al. A review on monocular tracking and mapping: from model-based to data-driven methods. Vis Comput 39, 5897–5924 (2023). https://doi.org/10.1007/s00371-022-02702-z


Keywords

Navigation