Abstract
Camera focal length estimation from a single image is important for many computer vision tasks, yet previous methods have not achieved satisfactory accuracy. This paper proposes a focal length estimation approach based on the distances among four points in the scene of a single image. The problem is cast as a nonlinear optimization by adapting the standard pinhole camera model under distance constraints. Multiple algorithms are employed to solve the optimization, and the best solution is taken as the final estimate. Experimental results show that our method obtains a more accurate focal length than several state-of-the-art methods in the single-image setting. In addition, we provide simple application examples and illustrate the intuitive effects of focal length estimation errors. We also demonstrate experimentally that distance information improves the estimation of the focal length.
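The formulation described above can be sketched as a small nonlinear least-squares problem: back-project the four image points through a pinhole model with unknown focal length \(f\) and unknown depths \(z_i\), and require the resulting 3D points to satisfy the known pairwise distances. This is a minimal illustrative sketch, not the paper's exact formulation; the function names and the choice of `scipy.optimize.least_squares` are assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_focal(pix, dists, img_center, f0, z0):
    """Recover the focal length f (and depths z) of a pinhole camera from
    four image points whose pairwise 3D distances are known.
    Hypothetical sketch of a distance-constrained formulation.

    pix        : (4, 2) array of pixel coordinates (u, v)
    dists      : dict {(i, j): metric distance between points i and j}
    img_center : (cx, cy), assumed principal point
    f0, z0     : initial guesses for the focal length and the four depths
    """
    cx, cy = img_center
    pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]

    def backproject(params):
        f, z = params[0], params[1:]
        # Pinhole model: X = z*(u-cx)/f, Y = z*(v-cy)/f, Z = z
        X = z * (pix[:, 0] - cx) / f
        Y = z * (pix[:, 1] - cy) / f
        return np.stack([X, Y, z], axis=1)

    def residuals(params):
        P = backproject(params)
        # One residual per point pair: reconstructed vs. known distance
        return np.array([np.linalg.norm(P[i] - P[j]) - dists[(i, j)]
                         for i, j in pairs])

    x0 = np.concatenate([[f0], z0])
    # Positivity bounds keep f and the depths physically meaningful
    sol = least_squares(residuals, x0, bounds=(1e-3, np.inf))
    return sol.x[0]
```

With four points there are six pairwise distances and five unknowns (\(f\) and four depths), so the system is overdetermined and a least-squares solver is a natural fit.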
References
Abbas, S.A., Zisserman, A.: A Geometric Approach to Obtain a Bird’s Eye View From an Image. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 4095–4104. IEEE, Seoul, Korea (South) (2019). https://doi.org/10.1109/ICCVW.2019.00504
Barreto, J.P.: A unifying geometric representation for central projection systems. Computer Vision Image Understand. 103(3), 208–217 (2006). https://doi.org/10.1016/j.cviu.2006.06.003
Bogdan, O., Eckstein, V., Rameau, F., Bazin, J.C.: DeepCalib: A deep learning approach for automatic intrinsic calibration of wide field-of-view cameras. In: Proceedings of the 15th ACM SIGGRAPH European Conference on Visual Media Production, CVMP ’18, pp. 1–10. Association for Computing Machinery, London, United Kingdom (2018). https://doi.org/10.1145/3278471.3278479
Cao, Y.T., Wang, J.M., Sun, Y.K., Duan, X.J.: Circle Marker Based Distance Measurement Using a Single Camera. LNSE 1, 376–380 (2013). https://doi.org/10.7763/LNSE.2013.V1.80
Caprile, B., Torre, V.: Using vanishing points for camera calibration. Int. J. Comput. Vision 4(2), 127–139 (1990). https://doi.org/10.1007/BF00127813
Chen, H.T.: Geometry-based camera calibration using five-point correspondences from a single image. IEEE Trans. Circuits Syst. Video Technol. 27(12), 2555–2566 (2017). https://doi.org/10.1109/TCSVT.2016.2595319
Chen, Q., Wu, H., Wada, T.: Camera calibration with two arbitrary coplanar circles. In: Pajdla, T., Matas, J. (eds.) Computer Vision – ECCV 2004. Springer, Berlin Heidelberg (2004)
Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust region methods. Soc. Indus. Appl. Math. (2000). https://doi.org/10.1137/1.9780898719857
Coughlan, J., Yuille, A.: Manhattan World: Compass direction from a single image by Bayesian inference. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 941–947. IEEE, Kerkyra, Greece (1999). https://doi.org/10.1109/ICCV.1999.790349
Deutscher, J., Isard, M., MacCormick, J.: Automatic camera calibration from a single Manhattan image. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) Computer vision – ECCV, 2002. Springer, Berlin Heidelberg (2002)
Godard, C., Aodha, O.M., Firman, M., Brostow, G.: Digging Into Self-Supervised Monocular Depth Estimation. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 3827–3837. IEEE, Seoul, Korea (South) (2019). https://doi.org/10.1109/ICCV.2019.00393
Gopalan, R., Li, R., Chellappa, R.: Domain adaptation for object recognition: An unsupervised approach. In: 2011 International Conference on Computer Vision, pp. 999–1006. IEEE, Barcelona, Spain (2011). https://doi.org/10.1109/ICCV.2011.6126344
Gordon, A., Li, H., Jonschkowski, R., Angelova, A.: Depth From Videos in the Wild: Unsupervised Monocular Depth Learning From Unknown Cameras. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 8976–8985. IEEE, Seoul, Korea (South) (2019). https://doi.org/10.1109/ICCV.2019.00907
Hestenes, M., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stan. 49(6), 409 (1952). https://doi.org/10.6028/jres.049.044
Hold-Geoffroy, Y., Sunkavalli, K., Eisenmann, J., Fisher, M., Gambaretto, E., Hadap, S., Lalonde, J.F.: A Perceptual Measure for Deep Single Image Camera Calibration. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2354–2363. IEEE, Salt Lake City, UT, USA (2018). https://doi.org/10.1109/CVPR.2018.00250
Jancosek, M., Pajdla, T.: Multi-view reconstruction preserving weakly-supported surfaces. In: CVPR 2011, pp. 3121–3128. IEEE, Colorado Springs, CO, USA (2011). https://doi.org/10.1109/CVPR.2011.5995693
Jiang, G., Quan, L.: Detection of concentric circles for camera calibration. In: Tenth IEEE International Conference on Computer Vision (ICCV’05), vol. 1, pp. 333–340. IEEE, Beijing, China (2005). https://doi.org/10.1109/ICCV.2005.73
Levenberg, K.: A method for the solution of certain non-linear problems in least squares. Quart. Appl. Math. 2(2), 164–168 (1944). https://doi.org/10.1090/qam/10666
Li, B., Peng, K., Ying, X., Zha, H.: Simultaneous vanishing point detection and camera calibration from single images. In: Bebis, G., Boyle, R., Parvin, B., Koracin, D., Chung, R., Hammound, R., Hussain, M., Kar-Han, T., Crawfis, R., Thalmann, D., Kao, D., Avila, L. (eds.) Advances in visual computing. Springer, Berlin Heidelberg (2010)
Miyagawa, I., Arai, H., Koike, H.: Simple camera calibration from a single image using five points on two orthogonal 1-D objects. IEEE Trans. on Image Process. 19(6), 1528–1538 (2010). https://doi.org/10.1109/TIP.2010.2042118
Moulon, P., Monasse, P., Marlet, R.: Adaptive structure from motion with a contrario model estimation. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) Computer vision - ACCV. Springer, Berlin Heidelberg (2013)
Ricolfe-Viala, C., Sánchez-Salmerón, A.J.: Robust metric calibration of non-linear camera lens distortion. Pattern Recogn. 43(4), 1688–1699 (2010). https://doi.org/10.1016/j.patcog.2009.10.003
Virtanen, P., et al.: SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 17(3), 261–272 (2020). https://doi.org/10.1038/s41592-019-0686-2
Song, W., Wang, Y., Li, H.X., Cai, Z.: Locating multiple optimal solutions of nonlinear equation systems based on multiobjective optimization. IEEE Trans. Evol. Computat. 19(3), 414–431 (2015). https://doi.org/10.1109/TEVC.2014.2336865
Storn, R., Price, K.: Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11(4), 341–359 (1997). https://doi.org/10.1023/A:1008202821328
Torralba, A., Murphy, K.P., Freeman, W.T., Rubin, M.A.: Context-based vision system for place and object recognition. In: Proceedings Ninth IEEE International Conference on Computer Vision, vol. 1, pp. 273–280. IEEE, Nice, France (2003). https://doi.org/10.1109/ICCV.2003.1238354
Tsai, R.: A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J. Robot. Automat. 3(4), 323–344 (1987). https://doi.org/10.1109/JRA.1987.1087109
Wales, D.J., Doye, J.P.K.: Global optimization by basin-hopping and the lowest energy structures of lennard-jones clusters containing up to 110 atoms. J. Phys. Chem. A 101(28), 5111–5116 (1997). https://doi.org/10.1021/jp970984n
Wildenauer, H., Hanbury, A.: Robust camera self-calibration from monocular images of Manhattan worlds. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2831–2838. IEEE, Providence, RI (2012). https://doi.org/10.1109/CVPR.2012.6248008
Workman, S., Greenwell, C., Zhai, M., Baltenberger, R., Jacobs, N.: DEEPFOCAL: A method for direct focal length estimation. In: 2015 IEEE International Conference on Image Processing (ICIP), pp. 1369–1373. IEEE, Quebec City, QC, Canada (2015). https://doi.org/10.1109/ICIP.2015.7351024
Yan, H., Zhang, Y., Zhang, S., Zhao, S., Zhang, L.: Focal length estimation guided with object distribution on FocaLens dataset. J. Electron Imag. 26(3), 033018 (2017). https://doi.org/10.1117/1.JEI.26.3.033018
Yin, W., Zhang, J., Wang, O., Niklaus, S., Mai, L., Chen, S., Shen, C.: Learning to Recover 3D Scene Shape from a Single Image. arXiv:2012.09365 [cs] (2020). http://arxiv.org/abs/2012.09365
Zhang, C., Rameau, F., Kim, J., Argaw, D.M., Bazin, J.C., Kweon, I.S.: DeepPTZ: Deep Self-Calibration for PTZ Cameras. In: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1030–1038. IEEE, Snowmass Village, CO, USA (2020). https://doi.org/10.1109/WACV45572.2020.9093629
Zhang, Z.: A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Machine Intell. 22(11), 1330–1334 (2000). https://doi.org/10.1109/34.888718
Zhang, Z.: Camera calibration with one-dimensional objects. IEEE Trans. Pattern Anal. Machine Intell. 26(7), 892–899 (2004). https://doi.org/10.1109/TPAMI.2004.21
Zhu, C., Byrd, R.H., Lu, P., Nocedal, J.: Algorithm 778: L-BFGS-B: fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Softw. 23(4), 550–560 (1997). https://doi.org/10.1145/279232.279236
Funding
The research in this paper is funded by the National Natural Science Foundation of China (NSFC Nos. 51978271 and 61972160), the Guangdong Basic and Applied Basic Research Foundation under Grants 2020A1515010699 and 2021A1515012301, the Natural Science Fund of Guangdong Province under Grants 2019A1515011793 and 2021A1515011849, and the Fundamental Research Funds for the Central Universities (2020ZYGXZR042).
Appendix
1.1 Improvement of DeepCalib using distances
To use the distance information to improve the DeepCalib [3] results, we first estimate the focal length of all 189 images with DeepCalib and use these estimates as initial values. In addition, to start the optimization of Eq. (8), we need to initialize the depth z. Here we have two options: estimate the depths with a depth estimation method, for example MonoDepth2 [11], and scale them; or initialize the depths randomly several times and optimize from each initialization. Since MonoDepth2 produces depths without true scale information and its accuracy is still questionable, the second option is more likely to provide suitable initial depth values. Therefore, we randomly generate the initial depth values. Initializing CFLD in this way, we obtain the results in Fig. 11. We set the number of random initializations to 5, that is, \(k=5\).
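The multi-start scheme described above can be sketched as follows: keep the network-predicted focal length as the initial \(f\), draw \(k\) random depth initializations, run the optimization from each, and keep the lowest-cost solution. This is an illustrative sketch only; the function names, the depth range, and the use of `scipy.optimize.least_squares` are assumptions, not the paper's exact implementation.

```python
import numpy as np
from scipy.optimize import least_squares

def multi_start_refine(residual_fn, f_init, n_depths=4, k=5,
                       depth_range=(1.0, 10.0), seed=0):
    """Refine a network-predicted focal length by re-running a
    distance-constrained optimization from k random depth initializations
    and keeping the lowest-cost solution (hypothetical sketch).

    residual_fn : maps params [f, z1..zn] to a residual vector
    f_init      : focal length predicted by a method such as DeepCalib
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(k):
        # Random positive depths; the assumed range is arbitrary here
        z0 = rng.uniform(*depth_range, size=n_depths)
        sol = least_squares(residual_fn,
                            np.concatenate([[f_init], z0]),
                            bounds=(1e-3, np.inf))
        if best is None or sol.cost < best.cost:
            best = sol
    return best.x[0], best.cost
```

Keeping the best of several restarts is what makes random depth initialization viable despite the non-convexity of the objective.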
Figure 11 shows the results after optimization using the distance information. As the left panel shows, the distance information does improve the focal length estimates over the entire dataset. However, as the right panel shows, not every image is improved. Two factors cause this. On the one hand, our method still suffers from defects related to the scale problem. On the other hand, the initial focal length and depth values estimated by the other methods are not precise enough for the optimization to reach the proper minimum. Overall, distance information improves the accuracy of the other methods, which demonstrates its value for focal length estimation.
1.2 What factors influence the obtained performance?
To investigate the impact of camera quality on our method, we conducted a set of experiments on several major camera quality factors, including lighting conditions (which may vary with external lighting or internal exposure parameters), resolution, and image distortion. We find that low resolution and distortion have a relatively strong impact on the accuracy of our algorithm.
1.2.1 Illumination
Illumination is a crucial factor for many deep-learning-based algorithms because it changes the RGB values of pixels. We took photos of the same scene under multiple lighting conditions from a fixed camera perspective and found that, with the same marked points, the lighting does not affect the results (see Fig. 12). We can infer that our algorithm is not affected by color changes, because its input does not involve the RGB values of pixels.
1.2.2 Resolution
Resolution is also an important factor for the performance of image processing algorithms. We use downsampling to obtain images at different resolutions while keeping the input marker points consistent: at each downsampling step, the pixel coordinates of the marker points are scaled by half, while the physical distances between them remain unchanged. We ran the experiment 10 times for each of 5 resolutions. As Fig. 13 shows, the algorithm becomes less accurate and more unstable as the resolution decreases.
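The marker-point bookkeeping for this experiment can be sketched as follows: each halving of the resolution halves the pixel coordinates of the markers, while the metric distances between the underlying 3D points stay fixed. The helper name and signature are illustrative assumptions, not the paper's code.

```python
def downsample_markers(points, image_size, levels):
    """Simulate `levels` successive 2x downsamplings of an image.

    points     : list of (u, v) marker pixel coordinates
    image_size : (width, height) of the original image
    levels     : number of halving steps

    Returns the scaled marker coordinates and the new image size.
    The physical inter-point distances fed to the solver are unchanged.
    """
    s = 2 ** levels
    scaled = [(u / s, v / s) for u, v in points]
    new_size = (image_size[0] // s, image_size[1] // s)
    return scaled, new_size
```

For example, one downsampling step maps a marker at (640, 480) in a 1280x960 image to (320, 240) in a 640x480 image.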
1.2.3 Distortion
We use the camera's fisheye function to simulate distortion, adjusting the distortion degree step by step inside the camera. Since our method does not model camera distortion, it is unsuitable for images with severe distortion (see Fig. 14). Extending the model to handle more complex situations, including camera distortion, is one direction of future work.
Xiong, Y., Lin, Z., Li, G. et al. Camera focal length from distances in a single image. Vis Comput 37, 2869–2881 (2021). https://doi.org/10.1007/s00371-021-02233-z