Abstract
Camera pose optimization underlies geometric vision tasks such as 3D reconstruction, structure from motion, and visual odometry. We propose a multi-frame pose optimization method based on the inverse compositional algorithm. Neural networks are incorporated into the optimization model to ease the difficulties of hyperparameter selection and loss function design. Multiple frames are optimized jointly to fully exploit the constraints between the images in a sequence. A multi-layer stepwise scheme applies a scale factor to the loss of each layer to improve the convergence of the network. Simulations show that the proposed method estimates pose with higher precision than state-of-the-art methods.
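The per-layer scale factors mentioned in the abstract can be pictured as a weighted sum of per-layer losses. The following is only a minimal sketch under stated assumptions: the function name `weighted_multilevel_loss`, the mean-squared-residual loss, and the weight values are hypothetical illustrations and are not taken from the paper.

```python
def weighted_multilevel_loss(residuals_per_level, scale_factors):
    """Sum the mean squared residual of each level, weighted by its scale factor.

    Both arguments are illustrative assumptions: one flat list of residuals
    per level, and one scalar weight per level.
    """
    if len(residuals_per_level) != len(scale_factors):
        raise ValueError("need exactly one scale factor per level")
    total = 0.0
    for residual, scale in zip(residuals_per_level, scale_factors):
        # Mean squared residual at this level.
        mse = sum(r * r for r in residual) / len(residual)
        total += scale * mse
    return total


# Three levels, coarse to fine; all values are made up for illustration.
residuals = [[2.0] * 4, [1.0] * 16, [0.5] * 64]
scales = [0.25, 0.5, 1.0]  # hypothetical weights, not the paper's values
loss = weighted_multilevel_loss(residuals, scales)
```

With weights like these, a large residual at a coarse level contributes less to the total than it would unweighted, which is one plausible way such factors could balance the layers during training.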












Funding
This study was funded by the National Natural Science Foundation of China (grant number: 62103432).
Author information
Contributions
Tao Zhang designed the research. Wei Wu and Bangjie Li processed the data. Shaopeng Li drafted the manuscript. Yong Xian helped organize the manuscript.
Ethics declarations
Conflict of Interest
Shaopeng Li and Tao Zhang declare that they have no conflict of interest.
About this article
Cite this article
Li, S., Xian, Y., Wu, W. et al. Parameter-adaptive multi-frame joint pose optimization method. Vis Comput 39, 2529–2541 (2023). https://doi.org/10.1007/s00371-022-02476-4
DOI: https://doi.org/10.1007/s00371-022-02476-4