Camera focal length from distances in a single image

Abstract

Camera focal length estimation from a single image is important for many computer vision tasks, yet existing methods have not achieved satisfactory accuracy. This paper proposes a focal length estimation approach based on the distances among four points in the scene of a single image. The problem is cast as a nonlinear optimization by adapting the standard pinhole camera model under distance constraints. Multiple algorithms are employed to solve the optimization, and the best solution is taken as the final estimate. Experimental results show that our method obtains a more accurate focal length than several state-of-the-art single-image methods. In addition, we provide simple application examples that illustrate the practical effect of focal length estimation errors, and we demonstrate experimentally that distance information improves the solution of the focal length.

References

  1. Abbas, S.A., Zisserman, A.: A geometric approach to obtain a bird’s eye view from an image. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 4095–4104. IEEE, Seoul, Korea (South) (2019). https://doi.org/10.1109/ICCVW.2019.00504

  2. Barreto, J.P.: A unifying geometric representation for central projection systems. Comput. Vis. Image Underst. 103(3), 208–217 (2006). https://doi.org/10.1016/j.cviu.2006.06.003

  3. Bogdan, O., Eckstein, V., Rameau, F., Bazin, J.C.: DeepCalib: a deep learning approach for automatic intrinsic calibration of wide field-of-view cameras. In: Proceedings of the 15th ACM SIGGRAPH European Conference on Visual Media Production (CVMP ’18), pp. 1–10. ACM, London, United Kingdom (2018). https://doi.org/10.1145/3278471.3278479

  4. Cao, Y.T., Wang, J.M., Sun, Y.K., Duan, X.J.: Circle marker based distance measurement using a single camera. Lecture Notes on Software Engineering, pp. 376–380 (2013). https://doi.org/10.7763/LNSE.2013.V1.80

  5. Caprile, B., Torre, V.: Using vanishing points for camera calibration. Int. J. Comput. Vis. 4(2), 127–139 (1990). https://doi.org/10.1007/BF00127813

  6. Chen, H.T.: Geometry-based camera calibration using five-point correspondences from a single image. IEEE Trans. Circuits Syst. Video Technol. 27(12), 2555–2566 (2017). https://doi.org/10.1109/TCSVT.2016.2595319

  7. Chen, Q., Wu, H., Wada, T.: Camera calibration with two arbitrary coplanar circles. In: Pajdla, T., Matas, J. (eds.) Computer Vision – ECCV 2004. Springer, Berlin Heidelberg (2004)

  8. Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust Region Methods. Society for Industrial and Applied Mathematics (2000). https://doi.org/10.1137/1.9780898719857

  9. Coughlan, J., Yuille, A.: Manhattan World: compass direction from a single image by Bayesian inference. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 941–947. IEEE, Kerkyra, Greece (1999). https://doi.org/10.1109/ICCV.1999.790349

  10. Deutscher, J., Isard, M., MacCormick, J.: Automatic camera calibration from a single Manhattan image. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) Computer Vision – ECCV 2002. Springer, Berlin Heidelberg (2002)

  11. Godard, C., Mac Aodha, O., Firman, M., Brostow, G.: Digging into self-supervised monocular depth estimation. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 3827–3837. IEEE, Seoul, Korea (South) (2019). https://doi.org/10.1109/ICCV.2019.00393

  12. Gopalan, R., Li, R., Chellappa, R.: Domain adaptation for object recognition: an unsupervised approach. In: 2011 International Conference on Computer Vision, pp. 999–1006. IEEE, Barcelona, Spain (2011). https://doi.org/10.1109/ICCV.2011.6126344

  13. Gordon, A., Li, H., Jonschkowski, R., Angelova, A.: Depth from videos in the wild: unsupervised monocular depth learning from unknown cameras. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 8976–8985. IEEE, Seoul, Korea (South) (2019). https://doi.org/10.1109/ICCV.2019.00907

  14. Hestenes, M., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 49(6), 409–436 (1952). https://doi.org/10.6028/jres.049.044

  15. Hold-Geoffroy, Y., Sunkavalli, K., Eisenmann, J., Fisher, M., Gambaretto, E., Hadap, S., Lalonde, J.F.: A perceptual measure for deep single image camera calibration. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2354–2363. IEEE, Salt Lake City, UT, USA (2018). https://doi.org/10.1109/CVPR.2018.00250

  16. Jancosek, M., Pajdla, T.: Multi-view reconstruction preserving weakly-supported surfaces. In: CVPR 2011, pp. 3121–3128. IEEE, Colorado Springs, CO, USA (2011). https://doi.org/10.1109/CVPR.2011.5995693

  17. Jiang, G., Quan, L.: Detection of concentric circles for camera calibration. In: Tenth IEEE International Conference on Computer Vision (ICCV’05), vol. 1, pp. 333–340. IEEE, Beijing, China (2005). https://doi.org/10.1109/ICCV.2005.73

  18. Levenberg, K.: A method for the solution of certain non-linear problems in least squares. Quart. Appl. Math. 2(2), 164–168 (1944). https://doi.org/10.1090/qam/10666

  19. Li, B., Peng, K., Ying, X., Zha, H.: Simultaneous vanishing point detection and camera calibration from single images. In: Bebis, G., Boyle, R., Parvin, B., Koracin, D., Chung, R., Hammoud, R., Hussain, M., Tan, K.H., Crawfis, R., Thalmann, D., Kao, D., Avila, L. (eds.) Advances in Visual Computing. Springer, Berlin Heidelberg (2010)

  20. Miyagawa, I., Arai, H., Koike, H.: Simple camera calibration from a single image using five points on two orthogonal 1-D objects. IEEE Trans. Image Process. 19(6), 1528–1538 (2010). https://doi.org/10.1109/TIP.2010.2042118

  21. Moulon, P., Monasse, P., Marlet, R.: Adaptive structure from motion with a contrario model estimation. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) Computer Vision – ACCV 2012. Springer, Berlin Heidelberg (2013)

  22. Ricolfe-Viala, C., Sánchez-Salmerón, A.J.: Robust metric calibration of non-linear camera lens distortion. Pattern Recognit. 43(4), 1688–1699 (2010). https://doi.org/10.1016/j.patcog.2009.10.003

  23. Virtanen, P., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17(3), 261–272 (2020). https://doi.org/10.1038/s41592-019-0686-2

  24. Song, W., Wang, Y., Li, H.X., Cai, Z.: Locating multiple optimal solutions of nonlinear equation systems based on multiobjective optimization. IEEE Trans. Evol. Comput. 19(3), 414–431 (2015). https://doi.org/10.1109/TEVC.2014.2336865

  25. Storn, R., Price, K.: Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997). https://doi.org/10.1023/A:1008202821328

  26. Torralba, A., Murphy, K.P., Freeman, W.T., Rubin, M.A.: Context-based vision system for place and object recognition. In: Proceedings Ninth IEEE International Conference on Computer Vision, vol. 1, pp. 273–280. IEEE, Nice, France (2003). https://doi.org/10.1109/ICCV.2003.1238354

  27. Tsai, R.: A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J. Robot. Autom. 3(4), 323–344 (1987). https://doi.org/10.1109/JRA.1987.1087109

  28. Wales, D.J., Doye, J.P.K.: Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. J. Phys. Chem. A 101(28), 5111–5116 (1997). https://doi.org/10.1021/jp970984n

  29. Wildenauer, H., Hanbury, A.: Robust camera self-calibration from monocular images of Manhattan worlds. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2831–2838. IEEE, Providence, RI, USA (2012). https://doi.org/10.1109/CVPR.2012.6248008

  30. Workman, S., Greenwell, C., Zhai, M., Baltenberger, R., Jacobs, N.: DEEPFOCAL: a method for direct focal length estimation. In: 2015 IEEE International Conference on Image Processing (ICIP), pp. 1369–1373. IEEE, Quebec City, QC, Canada (2015). https://doi.org/10.1109/ICIP.2015.7351024

  31. Yan, H., Zhang, Y., Zhang, S., Zhao, S., Zhang, L.: Focal length estimation guided with object distribution on FocaLens dataset. J. Electron. Imaging 26(3), 033018 (2017). https://doi.org/10.1117/1.JEI.26.3.033018

  32. Yin, W., Zhang, J., Wang, O., Niklaus, S., Mai, L., Chen, S., Shen, C.: Learning to recover 3D scene shape from a single image. arXiv:2012.09365 [cs] (2020). http://arxiv.org/abs/2012.09365

  33. Zhang, C., Rameau, F., Kim, J., Argaw, D.M., Bazin, J.C., Kweon, I.S.: DeepPTZ: deep self-calibration for PTZ cameras. In: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1030–1038. IEEE, Snowmass Village, CO, USA (2020). https://doi.org/10.1109/WACV45572.2020.9093629

  34. Zhang, Z.: A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1330–1334 (2000). https://doi.org/10.1109/34.888718

  35. Zhang, Z.: Camera calibration with one-dimensional objects. IEEE Trans. Pattern Anal. Mach. Intell. 26(7), 892–899 (2004). https://doi.org/10.1109/TPAMI.2004.21

  36. Zhu, C., Byrd, R.H., Lu, P., Nocedal, J.: Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Softw. 23(4), 550–560 (1997). https://doi.org/10.1145/279232.279236

Funding

The research in this paper is funded by the National Natural Science Foundation of China (NSFC Nos. 51978271 and 61972160), the Guangdong Basic and Applied Basic Research Foundation under Grants 2020A1515010699 and 2021A1515012301, the Natural Science Fund of Guangdong Province under Grants 2019A1515011793 and 2021A1515011849, and the Fundamental Research Funds for the Central Universities (2020ZYGXZR042).

Author information

Corresponding author

Correspondence to Guiqing Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 Improvement of DeepCalib using distances

To use distance information to improve the DeepCalib [3] results, we first need DeepCalib's focal length estimates. We estimate the focal length of all 189 images with DeepCalib and use these estimates as initial values. In addition, to start the optimization of Eq. (8), we must initialize the depths z. There are two options: estimate the depths with a depth estimation method such as MonoDepth2 [11] and scale them, or optimize from several random depth initializations. Since MonoDepth2 produces depths without true scale information and its accuracy is still questionable, the second option is more likely to give suitable initial depth values, so we randomly generate the initial depths. We set the number of random initializations to 5, that is, \(k=5\). Initializing CFLD in this way yields the results in Fig. 11.
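To make the multi-start procedure concrete, the following is a minimal sketch, assuming the objective of Eq. (8) penalizes violations of the measured pairwise distances among points back-projected through a pinhole camera at unknown depths. The function names (back_project, estimate_focal), the choice of L-BFGS-B, and the depth sampling range are our illustrative assumptions, not the exact CFLD implementation.

```python
import numpy as np
from scipy.optimize import minimize

def back_project(f, cx, cy, uv, z):
    # Pinhole back-projection: pixel (u, v) at depth z -> 3D camera-frame point
    return np.array([(uv[0] - cx) * z / f, (uv[1] - cy) * z / f, z])

def objective(params, pixels, dists, cx, cy):
    # params = [f, z_1, ..., z_n]; dists maps point-index pairs (i, j) to
    # measured metric distances. Sum of squared distance-constraint violations.
    f, zs = params[0], params[1:]
    pts = [back_project(f, cx, cy, pixels[i], zs[i]) for i in range(len(pixels))]
    return sum((np.linalg.norm(pts[i] - pts[j]) - d) ** 2
               for (i, j), d in dists.items())

def estimate_focal(pixels, dists, cx, cy, f0, k=5, seed=0):
    # Multi-start: k random depth initializations around the initial focal
    # length f0 (e.g., from DeepCalib); keep the lowest-residual solution.
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(k):
        x0 = np.concatenate([[f0], rng.uniform(0.5, 10.0, len(pixels))])
        res = minimize(objective, x0, args=(pixels, dists, cx, cy),
                       method="L-BFGS-B", bounds=[(1e-3, None)] * len(x0))
        if best is None or res.fun < best.fun:
            best = res
    return best.x[0]  # estimated focal length in pixels
```

In the same spirit as the main text, several solvers (e.g., differential evolution [25], basin-hopping [28], or trust-region methods [8]) can be run on the same objective and the lowest-cost solution kept as the final estimate.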

Figure 11 shows the results after optimization with the distance information. As the left panel shows, distance information does improve focal length estimation over the entire dataset. However, as the right panel shows, not every image benefits. Two factors cause this. On the one hand, our method itself still suffers from defects related to the scale problem. On the other hand, the initial focal length and depth estimated by the other methods are not precise enough for the optimization to converge to the proper minimum. Nevertheless, distance information improves the accuracy of the other methods in general, which demonstrates its value for solving for the focal length.

1.2 What factors influence the obtained performance?

To investigate the impact of camera quality on our method, we conducted a set of experiments on several major factors: lighting conditions (which may vary due to external illumination or internal exposure parameters), resolution, and image distortion. We find that low resolution and distortion have a relatively strong impact on the accuracy of our algorithm.

1.2.1 Illumination

Illumination is a crucial factor for many deep-learning-based algorithms because it changes the RGB values of pixels. We took photos of the same scene under multiple lighting conditions from a fixed camera perspective, using the same marked points, and found that lighting does not affect the results (see Fig. 12). We infer that our algorithm is unaffected by color changes because its input does not involve pixel RGB values.

Fig. 12 Estimated effect under different lighting conditions. The lighting has no impact on the estimation results

Fig. 13 Estimated effect at different resolutions. The higher the resolution, the better and more stable the result

1.2.2 Resolution

Resolution is another important factor for image processing algorithms. We downsample the image to obtain different resolutions while keeping the input marker points consistent: at each downsampling step, the pixel coordinates of the marker points are halved, but the physical distances between them remain unchanged. We ran the experiment 10 times at each of 5 resolutions. As Fig. 13 shows, the algorithm becomes less accurate and less stable as the resolution decreases, as shown below.
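As an illustration of this protocol, a hypothetical driver reusing the estimate_focal sketch from Sect. 1.1 might look as follows; all marker coordinates, distances, and camera numbers here are invented for the example.

```python
import numpy as np

# Invented example data: four markers in a 2048x1536 image and their
# measured pairwise distances in metres.
pixels = np.array([[812.0, 604.0], [1490.0, 610.0],
                   [795.0, 1310.0], [1502.0, 1295.0]])
dists = {(0, 1): 0.30, (0, 2): 0.30, (1, 3): 0.30,
         (2, 3): 0.30, (0, 3): 0.424, (1, 2): 0.424}
w, h, f_init = 2048, 1536, 1800.0

for level in range(5):
    s = 0.5 ** level
    # Downsampling scales the pixel coordinates, the principal point, and the
    # focal length in pixels; the physical distances are unchanged.
    f_est = estimate_focal(pixels * s, dists, cx=w * s / 2, cy=h * s / 2,
                           f0=f_init * s, k=5)
    print(f"1/{2 ** level} resolution: f = {f_est:.1f} px")
```

Repeating such a run 10 times per resolution and comparing the spread of f_est against the correspondingly scaled ground truth would mirror the trend reported in Fig. 13.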

1.2.3 Distortion

Fig. 14 Estimated effect under different distortion conditions. When the distortion is large, our algorithm cannot accurately estimate the focal length

We use the camera's fisheye function to simulate distortion, adjusting the distortion level step by step through the camera's built-in settings. Since our method does not model lens distortion, it is unsuitable for images with severe distortion (see Fig. 14). Extending the model to handle more complex situations, including camera distortion, is one direction for future work.

About this article

Cite this article

Xiong, Y., Lin, Z., Li, G. et al. Camera focal length from distances in a single image. Vis Comput 37, 2869–2881 (2021). https://doi.org/10.1007/s00371-021-02233-z

