Parameter-adaptive multi-frame joint pose optimization method

Li, Shaopeng; Xian, Yong; Wu, Wei; Zhang, Tao; Li, Bangjie

doi:10.1007/s00371-022-02476-4

Original article
Published: 05 May 2022

Volume 39, pages 2529–2541, (2023)

The Visual Computer

Shaopeng Li (ORCID: orcid.org/0000-0001-7560-9951)¹,²,
Yong Xian¹,
Wei Wu¹,
Tao Zhang² &
…
Bangjie Li¹

Abstract

Camera pose optimization underpins geometric vision tasks such as 3D reconstruction, structure from motion, and visual odometry. We design a multi-frame pose optimization method based on the inverse compositional algorithm. Neural networks are incorporated into the optimization model to address the problems of hyperparameter selection and loss function design. Multi-frame joint optimization is used to fully exploit the constraints between sequential images. A multi-layer stepwise method, which applies a scale factor to the loss of each layer, is used to improve the convergence of the network. Simulations show that the proposed method achieves higher pose estimation precision than state-of-the-art methods.
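
The abstract names the inverse compositional algorithm as the basis of the method. As a rough illustration only, the sketch below applies the classic inverse compositional update to a toy 1D translation problem. It is not the paper's multi-frame camera pose formulation: the function names, the Gaussian template, the sampling grid, and the fixed `damping` value are all assumptions made for this example, whereas the abstract describes using neural networks to choose such hyperparameters.

```python
import math


def template(x):
    # Toy 1D "template image": a Gaussian bump centred at 0 (assumed for illustration).
    return math.exp(-0.5 * x * x)


def template_grad(x):
    # Analytic derivative of the template with respect to x.
    return -x * template(x)


def make_image(true_shift):
    # Build an image that satisfies image(x + true_shift) == template(x).
    return lambda x: template(x - true_shift)


def inverse_compositional_1d(image, xs, p0=0.0, damping=0.0, iters=20):
    # Warp model: W(x; p) = x + p. The gradients and the (scalar) Hessian are
    # computed once on the template and reused in every iteration, which is
    # the defining property of the inverse compositional scheme.
    grads = [template_grad(x) for x in xs]
    hessian = sum(g * g for g in grads)
    p = p0
    for _ in range(iters):
        # Residual between the warped image and the template.
        residuals = [image(x + p) - template(x) for x in xs]
        rhs = sum(g * r for g, r in zip(grads, residuals))
        # Levenberg-Marquardt style damping with a hand-picked constant (assumption).
        delta = rhs / (hessian * (1.0 + damping))
        # Compose the current warp with the inverse of the incremental warp.
        p -= delta
        if abs(delta) < 1e-10:
            break
    return p


# Sample positions and a hypothetical ground-truth shift of 0.3.
xs = [i * 0.1 for i in range(-30, 31)]
estimate = inverse_compositional_1d(make_image(0.3), xs)
print(estimate)
```

With `damping=0.0` the update is a plain Gauss-Newton step; larger values take more cautious steps and converge more slowly on this toy problem, which hints at why the choice of such parameters matters.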

Funding

This study was funded by the National Natural Science Foundation of China (grant number: 62103432).

Author information

Authors and Affiliations

High-Tech Institute of Xi’an, Xi’an, 710025, China
Shaopeng Li, Yong Xian, Wei Wu & Bangjie Li
Department of Automation, Tsinghua University, Beijing, 100084, China
Shaopeng Li & Tao Zhang

Contributions

Tao Zhang designed the research. Wei Wu and Bangjie Li processed the data. Shaopeng Li drafted the manuscript. Yong Xian helped organize the manuscript.

Corresponding author

Correspondence to Shaopeng Li.

Ethics declarations

Conflict of Interest

Shaopeng Li and Tao Zhang declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, S., Xian, Y., Wu, W. et al. Parameter-adaptive multi-frame joint pose optimization method. Vis Comput 39, 2529–2541 (2023). https://doi.org/10.1007/s00371-022-02476-4

Accepted: 18 March 2022
Published: 05 May 2022
Issue Date: July 2023
DOI: https://doi.org/10.1007/s00371-022-02476-4
