Fast and robust absolute camera pose estimation with known focal length

  • Neural Computing in Next Generation Virtual Reality Technology
Neural Computing and Applications

Abstract

Some 3D computer vision techniques, such as structure from motion (SFM) and augmented reality (AR), depend on a perspective-n-point (PnP) algorithm to estimate the absolute camera pose. However, existing PnP algorithms struggle to balance accuracy and efficiency, and most of them do not make full use of internal camera information such as the focal length. To address these drawbacks, we propose a fast and robust PnP (FRPnP) method for calculating the absolute camera pose in 3D computer vision. In the proposed FRPnP method, we first formulate the PnP problem as an optimization problem in the null space, which avoids the effects of the depth of each 3D point. Second, we obtain the solution directly using singular value decomposition. Finally, an accurate camera pose is obtained by an optimization strategy. We evaluate the proposed FRPnP algorithm in four ways: on a synthetic dataset, on real images, and by applying it in AR and SFM systems. Experimental results show that the proposed FRPnP method obtains the best balance between computational cost and precision, and clearly outperforms state-of-the-art PnP methods.
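
The first two steps named in the abstract (a depth-free null-space formulation solved directly by singular value decomposition) can be illustrated with a generic direct linear transform (DLT). The following is a minimal sketch under stated assumptions, not the FRPnP method itself: the names `dlt_pnp` and `project`, their parameters, and the synthetic data are hypothetical and chosen only for illustration, the focal length is assumed known and the pixels square, at least six non-coplanar points are assumed, and the final optimization step is omitted.

```python
import numpy as np


def project(points_3d, R, t, focal, principal_point=(0.0, 0.0)):
    """Pinhole projection of world points with pose (R, t) and known focal length."""
    Xc = np.asarray(points_3d, dtype=float) @ R.T + t  # world -> camera frame
    return focal * Xc[:, :2] / Xc[:, 2:3] + np.asarray(principal_point)


def dlt_pnp(points_3d, points_2d, focal, principal_point=(0.0, 0.0)):
    """Generic DLT pose estimate (illustrative only, not the FRPnP algorithm).

    Assumes n >= 6 non-coplanar points and a known focal length.
    Returns a rotation matrix R (3x3) and a translation t (3,).
    """
    X = np.asarray(points_3d, dtype=float)
    uv = np.asarray(points_2d, dtype=float)
    n = X.shape[0]

    # Use the known focal length to obtain normalized image coordinates.
    xn = (uv - np.asarray(principal_point)) / focal
    Xh = np.hstack([X, np.ones((n, 1))])  # homogeneous world points

    # Each point gives two equations in the 12 entries of P = [R | t]:
    #   p1.Xh - x * p3.Xh = 0   and   p2.Xh - y * p3.Xh = 0
    # The unknown depth p3.Xh has been eliminated, so A @ vec(P) = 0.
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = Xh
    A[0::2, 8:12] = -xn[:, [0]] * Xh
    A[1::2, 4:8] = Xh
    A[1::2, 8:12] = -xn[:, [1]] * Xh

    # The solution spans the null space of A: take the right singular
    # vector that belongs to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)

    # P is defined up to sign; choose the sign that puts points in front.
    if np.mean(Xh @ P[2]) < 0:
        P = -P

    # The left 3x3 block is (scale * R); project it onto a proper rotation.
    U, S, Vt_m = np.linalg.svd(P[:, :3])
    d = np.sign(np.linalg.det(U @ Vt_m))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt_m
    t = P[:, 3] / S.mean()  # undo the arbitrary scale of the null vector
    return R, t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(20, 3))  # synthetic world points
    a = 0.3  # rotation angle about the z axis, in radians
    R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                       [np.sin(a), np.cos(a), 0.0],
                       [0.0, 0.0, 1.0]])
    t_true = np.array([0.1, -0.2, 5.0])
    uv = project(X, R_true, t_true, focal=800.0)
    R_est, t_est = dlt_pnp(X, uv, focal=800.0)
    print("rotation error:", np.abs(R_est - R_true).max())
    print("translation error:", np.abs(t_est - t_true).max())
```

With noisy correspondences, a linear estimate of this kind would normally serve only as the starting point for a nonlinear refinement of the reprojection error, which corresponds to the third step mentioned in the abstract.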

Acknowledgements

This work was supported by grants from the National Science Foundation of China (Nos. 61370167, 61673157, 61402018, and 61305093), the National Key Research and Development Plan (Grant No. 2016YFC0800100), and the Natural Science Foundation of Anhui Province (Nos. KJ2014ZD27, JZ2015AKZR0664, and 1604e0302001). The authors thank the anonymous reviewers for their helpful and constructive comments, which greatly improved the paper.

Author information

Corresponding author

Correspondence to Xiao Ping Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Cao, M.W., Jia, W., Zhao, Y. et al. Fast and robust absolute camera pose estimation with known focal length. Neural Comput & Applic 29, 1383–1398 (2018). https://doi.org/10.1007/s00521-017-3032-6

  • DOI: https://doi.org/10.1007/s00521-017-3032-6
