Abstract
Visual odometry is the process of estimating the motion between two consecutive images. Traditional visual odometry pipelines require the careful engineering of geometry-based building blocks; they are highly sensitive to noise, and the degradation of a single subprocess compromises the performance of the entire system. Learning-based methods, in contrast, automatically learn the features required for motion mapping, but current approaches are computationally expensive and need a significant amount of time to estimate the pose from a video sequence. This paper proposes a lightweight deep neural network architecture that estimates odometry from features refined through spatial attention. Three different training and test splits of the KITTI benchmark are used to evaluate the proposed approach. The execution time of the proposed approach is \(\sim\)1 ms, a 47\(\times\) speedup over [1]. The experiments demonstrate the promising performance of the proposed method relative to the methods used in the comparison.
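The feature refinement mentioned in the abstract follows the spatial-attention idea of CBAM [42]: channel-wise average and max pooling produce a two-channel descriptor, a small convolution turns that descriptor into a per-pixel attention map, and the feature map is rescaled by it. Below is a minimal NumPy sketch of this mechanism; the function name, tensor shapes, and the externally supplied `kernel` are illustrative assumptions, not the paper's implementation (in a trained network the kernel weights are learned).

```python
import numpy as np

def spatial_attention(feat, kernel):
    """CBAM-style spatial attention (a sketch, not the paper's exact module).

    feat:   (C, H, W) feature map from a convolutional backbone.
    kernel: (2, k, k) weights of the attention convolution.
    """
    # Channel-wise average- and max-pooling give two (H, W) descriptors.
    desc = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)

    # 'Same'-padded 2D convolution of the stacked descriptors.
    k = kernel.shape[-1]
    p = k // 2
    padded = np.pad(desc, ((0, 0), (p, p), (p, p)))
    H, W = desc.shape[1:]
    logits = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)

    attn = 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> values in (0, 1)
    return feat * attn                    # rescale every channel by the map
```

With an all-zero kernel the attention map is uniformly 0.5 and the features are simply halved; a trained kernel instead emphasizes spatially informative regions before the pose regression head.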
References
Wang S, Clark R, Wen H, Trigoni A (2018) End-to-end, sequence-to-sequence probabilistic visual odometry through deep neural networks. Int J Rob Res 37:513–542. https://doi.org/10.1177/0278364917734298
Yousif K, Bab-Hadiashar A, Hoseinnezhad R (2015) An overview to visual odometry and visual slam: applications to mobile robotics. Intell Indus Syst 1(4):289–311. https://doi.org/10.1007/s40903-015-0032-7
Zhai M, Xiang X (2021) Geometry understanding from autonomous driving scenarios based on feature refinement. Neural Comput Appl 33(8):3209–3220. https://doi.org/10.1007/s00521-020-05192-z
Liu K, Li Q, Qiu G (2020) Posegan: a pose-to-image translation framework for camera localization. ISPRS J Photogramm Remote Sens 166:308–315. https://doi.org/10.1016/j.isprsjprs.2020.06.010
Klein, G., Murray, D.: Parallel tracking and mapping for small ar workspaces. In: Proceedings of the IEEE and ACM International symposium on mixed and augmented reality, pp. 225–234 (2007)
Davison AJ, Reid ID, Molton ND, Stasse O (2007) Monoslam: real-time single camera slam. IEEE Trans Pattern Anal Mach Intell 29(6):1052–1067. https://doi.org/10.1109/TPAMI.2007.1049
Mur-Artal R, Montiel JMM, Tardos JD (2015) ORB-SLAM: a versatile and accurate monocular slam system. IEEE Trans Robot 31(5):1147–1163. https://doi.org/10.1109/TRO.2015.2463671
Cao MW, Jia W, Zhao Y, Li SJ, Liu XP (2018) Fast and robust absolute camera pose estimation with known focal length. Neural Comput Appl 29(5):1383–1398. https://doi.org/10.1007/s00521-017-3032-6
Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: Dtam: Dense tracking and mapping in real-time. In: Proceedings of the IEEE International conference on computer vision (ICCV), pp. 2320–2327 (2011)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? the kitti vision benchmark suite. In: Proceedings of the IEEE Conference on computer vision and pattern recognition (CVPR), pp. 3354–3361 (2012)
Bay H, Ess A, Tuytelaars T, Van Gool L (2008) Speeded-up robust features (surf). Comput Vis Image Underst 110(3):346–359. https://doi.org/10.1016/j.cviu.2007.09.014
Muja, M., Lowe, D.G.: Fast matching of binary features. In: Proceedings of the IEEE Conference on Computer and Robot Vision, pp. 404–410 (2012)
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: Orb: An efficient alternative to sift or surf. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2564–2571 (2011)
Pumarola, A., Vakhitov, A., Agudo, A., Sanfeliu, A., Moreno-Noguer, F.: Pl-slam: Real-time monocular visual slam with points and lines. In: Proceedings of the IEEE International Conference on robotics and automation (ICRA), pp. 4503–4508 (2017)
McCormac, J., Clark, R., Bloesch, M., Davison, A., Leutenegger, S.: Fusion++: Volumetric object-level slam. In: Proceedings of the IEEE International Conference on 3D Vision (3DV), pp. 32–41 (2018)
Herrera, D.C., Kim, K., Kannala, J., Pulli, K., Heikkilä, J.: Dt-slam: Deferred triangulation for robust slam. In: Proceedings of the IEEE International Conference on 3D Vision (3DV), vol. 1, pp. 609–616 (2014)
Engel, J., Schöps, T., Cremers, D.: Lsd-slam: Large-scale direct monocular slam. In: Proceedings of the European Conference on computer vision (ECCV), pp. 834–849 (2014)
Forster, C., Pizzoli, M., Scaramuzza, D.: Svo: Fast semi-direct monocular visual odometry. In: Proceedings of the IEEE International Conference on robotics and automation (ICRA), pp. 15–22 (2014)
Engel J, Koltun V, Cremers D (2017) Direct sparse odometry. IEEE Trans Pattern Anal Mach Intell 40(3):611–625. https://doi.org/10.1109/TPAMI.2017.2658577
Zubizarreta J, Aguinaga I, Montiel JMM (2020) Direct sparse mapping. IEEE Trans Robot 36(4):1363–1370. https://doi.org/10.1109/TRO.2020.2991614
Roberts, R., Nguyen, H., Krishnamurthi, N., Balch, T.: Memory-based learning for visual odometry. In: Proceedings of the IEEE International Conference on robotics and automation (ICRA), pp. 47–52 (2008)
Guizilini V, Ramos F (2013) Semi-parametric learning for visual odometry. Int J Rob Res 32(5):526–546. https://doi.org/10.1177/0278364912472245
Kendall, A., Grimes, M., Cipolla, R.: Posenet: A convolutional network for real-time 6-dof camera relocalization. In: Proceedings of the IEEE International Conference on computer vision (ICCV), pp. 2938–2946 (2015)
Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: Flownet 2.0: Evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on computer vision and pattern Recognition (CVPR), pp. 2462–2470 (2017)
CS Kumar, A., Bhandarkar, S.M., Prasad, M.: Depthnet: A recurrent neural network architecture for monocular depth prediction. In: Proceedings of the IEEE Conference on computer vision and pattern recognition workshops (CVPRW), pp. 283–291 (2018)
Costante G, Mancini M, Valigi P, Ciarfuglia TA (2015) Exploring representation learning with cnns for frame-to-frame ego-motion estimation. IEEE Robot Autom Lett 1(1):18–25. https://doi.org/10.1109/LRA.2015.2505717
Brox, T., Bruhn, A., Papenberg, N., Weickert, J.: High accuracy optical flow estimation based on a theory for warping. In: Proceedings of the European Conference on computer vision (ECCV), pp. 25–36 (2004)
Li X, Hou Y, Wang P, Gao Z, Xu M, Li W (2021) Transformer guided geometry model for flow-based unsupervised visual odometry. Neural Comput Appl 1–12. https://doi.org/10.1007/s00521-020-05545-8
Muller, P., Savakis, A.: Flowdometry: An optical flow and deep learning based approach to visual odometry. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 624–631 (2017)
Zhao B, Huang Y, Wei H, Hu X (2021) Ego-motion estimation using recurrent convolutional neural networks through optical flow learning. Electronics 10(3):222. https://doi.org/10.3390/electronics10030222
Pandey T, Pena D, Byrne J, Moloney D (2021) Leveraging deep learning for visual odometry using optical flow. Sensors 21(4):1313. https://doi.org/10.3390/s21041313
Sun, D., Yang, X., Liu, M.-Y., Kautz, J.: Pwc-net: Cnns for optical flow using pyramid, warping, and cost volume. In: Proceedings of the IEEE Conference on computer vision and pattern recognition (CVPR), pp. 8934–8943 (2018)
Hui, T.-W., Tang, X., Loy, C.C.: Liteflownet: A lightweight convolutional neural network for optical flow estimation. In: Proceedings of the IEEE Conference on computer vision and pattern recognition (CVPR), pp. 8981–8989 (2018)
Saputra, M.R.U., Gusmão, P.P.B.D., Almalioglu, Y., Markham, A., Trigoni, A.: Distilling knowledge from a deep pose regressor network. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 263–272 (2019)
Wang X, Zhang H (2020) Deep monocular visual odometry for ground vehicle. IEEE Access 8:175220–175229. https://doi.org/10.1109/ACCESS.2020.3025557
Kendall, A., Gal, Y.: What uncertainties do we need in bayesian deep learning for computer vision? In: Proceedings of the Neural Information Processing Systems (NIPS) (2017)
Kendall, A., Cipolla, R.: Geometric loss functions for camera pose regression with deep learning. In: Proceedings of the IEEE Conference on computer vision and pattern recognition (CVPR), pp. 6555–6564 (2017)
Woo, S., Park, J., Lee, J.-Y., Kweon, I.-S.: Cbam: Convolutional block attention module. In: Proceedings of the European Conference on computer vision (ECCV) (2018)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Geiger, A., Ziegler, J., Stiller, C.: Stereoscan: Dense 3d reconstruction in real-time. In: Proceedings of the Intelligent Vehicles Symposium (IV), pp. 963–968 (2011)
Saputra, M.R.U., Gusmão, P.P.B.D., Wang, S., Markham, A., Trigoni, A.: Learning monocular visual odometry through geometry-aware curriculum learning. In: Proceedings of the IEEE International Conference on robotics and automation (ICRA), pp. 3549–3555 (2019)
Liu Y, Wang H, Wang J, Wang X (2021) Unsupervised monocular visual odometry based on confidence evaluation. IEEE Trans Intell Transp Syst 1–10. https://doi.org/10.1109/TITS.2021.3053412
Zhou, T., Brown, M.A., Snavely, N., Lowe, D.: Unsupervised learning of depth and ego-motion from video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6612–6619 (2017)
Yin, Z., Shi, J.: GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose. In: Proceedings of the IEEE Conference on Computer vision and pattern recognition (CVPR), pp. 1983–1992 (2018)
Bian J-W, Zhan H, Wang N, Li Z, Zhang L, Shen C, Cheng M-M, Reid I (2021) Unsupervised scale-consistent depth learning from video. Int J Comput Vis 1–17. https://doi.org/10.1007/s11263-021-01484-6
Blanco-Claraco J-L, Moreno-Duenas F-A, González-Jiménez J (2014) The málaga urban dataset: High-rate stereo and lidar in a realistic urban scenario. Int J Rob Res 33(2):207–214. https://doi.org/10.1177/0278364913507326
Acknowledgements
The authors are grateful to the sponsors who provided YUTP Grant (015LC0-243) for this project.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Gadipudi, N., Elamvazuthi, I., Lu, CK. et al. Lightweight spatial attentive network for vehicular visual odometry estimation in urban environments. Neural Comput & Applic 34, 18823–18836 (2022). https://doi.org/10.1007/s00521-022-07484-y