Fast and robust absolute camera pose estimation with known focal length

Cao, Ming Wei; Jia, Wei; Zhao, Yang; Li, Shu Jie; Liu, Xiao Ping

doi:10.1007/s00521-017-3032-6

Neural Computing in Next Generation Virtual Reality Technology
Published: 07 July 2017

Volume 29, pages 1383–1398, (2018)
Neural Computing and Applications

Abstract

Some 3D computer vision techniques, such as structure from motion (SFM) and augmented reality (AR), depend on a perspective-n-point (PnP) algorithm to estimate the absolute camera pose. However, existing PnP algorithms struggle to balance accuracy and efficiency, and most of them do not make full use of internal camera information such as the focal length. To address these drawbacks, we propose a fast and robust PnP (FRPnP) method to calculate the absolute camera pose for 3D computer vision. In the proposed FRPnP method, we first formulate the PnP problem as an optimization problem in the null space, which avoids the effects of the depth of each 3D point. Second, we obtain the solution directly using singular value decomposition. Finally, an accurate camera pose is obtained by an optimization strategy. We evaluate the proposed FRPnP algorithm in four ways: on a synthetic dataset, on real images, and by applying it in AR and SFM systems. Experimental results show that the proposed FRPnP method obtains the best balance between computational cost and precision and clearly outperforms the state-of-the-art PnP methods.
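
The FRPnP formulation itself is not given in this preview, so the following is only a generic illustration of the two ideas the abstract names: eliminating per-point depth and reading a direct solution off the null space with a singular value decomposition. It is a minimal direct-linear-transform sketch in Python with NumPy, not the authors' method. The function name `linear_pnp`, its arguments, and the requirement of at least six non-coplanar points are assumptions made for this example, and the final refinement step mentioned in the abstract is omitted.

```python
import numpy as np


def linear_pnp(points_3d, points_2d, K):
    """Generic DLT-style pose estimate (illustrative only, NOT FRPnP).

    points_3d: (n, 3) world points, n >= 6 and non-coplanar (assumption).
    points_2d: (n, 2) pixel observations of those points.
    K:         (3, 3) known intrinsic matrix (contains the focal length).
    Returns (R, t) such that a camera-frame point is R @ X + t.
    """
    n = points_3d.shape[0]

    # Use the known intrinsics to convert pixels to normalized coordinates.
    pix_h = np.hstack([points_2d, np.ones((n, 1))])
    xn = (np.linalg.inv(K) @ pix_h.T).T

    # Each point gives two equations that are linear in the 12 entries of
    # P = [R | t] and no longer contain the unknown depth of the point:
    #   p1 . Xh - x * (p3 . Xh) = 0
    #   p2 . Xh - y * (p3 . Xh) = 0
    Xh = np.hstack([points_3d, np.ones((n, 1))])
    A = np.zeros((2 * n, 12))
    for i in range(n):
        x, y = xn[i, 0], xn[i, 1]
        A[2 * i, 0:4] = Xh[i]
        A[2 * i, 8:12] = -x * Xh[i]
        A[2 * i + 1, 4:8] = Xh[i]
        A[2 * i + 1, 8:12] = -y * Xh[i]

    # The right singular vector of the smallest singular value spans the
    # (approximate) null space of A; it equals P up to an unknown scale.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)

    # Recover the scale and project the left 3x3 block onto a rotation.
    U, S, Vt_m = np.linalg.svd(P[:, :3])
    R = U @ Vt_m
    t = P[:, 3] / S.mean()

    # The null vector has an arbitrary sign; flip it if R is a reflection.
    if np.linalg.det(R) < 0:
        R, t = -R, -t
    return R, t
```

With noise-free correspondences this recovers the pose up to numerical precision. With noisy data the result would normally serve only as an initial value for a nonlinear refinement such as reprojection-error minimization.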

Acknowledgements

This work was supported by grants from the National Science Foundation of China (Nos. 61370167, 61673157, 61402018, and 61305093), the National Key Research and Development Plan under Grant No. 2016YFC0800100, and the Natural Science Foundation of Anhui Province (Nos. KJ2014ZD27, JZ2015AKZR0664, and 1604e0302001). The authors would like to thank the anonymous reviewers for their helpful and constructive comments, which greatly improved the paper.

Author information

Authors and Affiliations

School of Computer and Information, Hefei University of Technology, Hefei, 230009, China
Ming Wei Cao, Wei Jia, Yang Zhao, Shu Jie Li & Xiao Ping Liu

Corresponding author

Correspondence to Xiao Ping Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Cao, M.W., Jia, W., Zhao, Y. et al. Fast and robust absolute camera pose estimation with known focal length. Neural Comput & Applic 29, 1383–1398 (2018). https://doi.org/10.1007/s00521-017-3032-6

Received: 22 December 2016
Accepted: 25 April 2017
Published: 07 July 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s00521-017-3032-6
