Confocal Stereo

Hasinoff, Samuel W.; Kutulakos, Kiriakos N.

doi:10.1007/s11263-008-0164-2



Abstract

We present confocal stereo, a new method for computing 3D shape by controlling the focus and aperture of a lens. The method is specifically designed for reconstructing scenes with high geometric complexity or fine-scale texture. To achieve this, we introduce the confocal constancy property, which states that as the lens aperture varies, the pixel intensity of a visible in-focus scene point will vary in a scene-independent way that can be predicted by prior radiometric lens calibration. The only requirement is that incoming radiance within the cone subtended by the largest aperture is nearly constant. First, we develop a detailed lens model that factors out the distortions in high-resolution SLR cameras (12MP or more) with large-aperture lenses (e.g., f/1.2). This allows us to assemble an A×F aperture-focus image (AFI) for each pixel that collects the undistorted measurements over all A apertures and F focus settings. In the AFI representation, confocal constancy reduces to color comparisons within regions of the AFI, and leads to focus metrics that can be evaluated separately for each pixel. We propose two such metrics and present initial reconstruction results for complex scenes, as well as for a scene with known ground-truth shape.
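
To make the per-pixel idea concrete, the following is a minimal sketch of how confocal constancy could be tested on a single pixel's AFI. It is an illustration only, not either of the two metrics proposed in the article, which the abstract does not specify. It assumes the AFI has already been corrected by radiometric calibration so that an in-focus point yields the same value at every aperture, and it uses plain variance across apertures as a stand-in focus metric. The function name `best_focus_index` and the toy numbers are hypothetical.

```python
from statistics import pvariance


def best_focus_index(afi):
    """Pick the focus setting whose calibrated intensities vary least across apertures.

    afi[a][f] is the calibrated intensity of one pixel at aperture index a
    and focus index f (an A x F table). Assumption: calibration has already
    divided out the aperture-dependent exposure change, so an in-focus point
    gives a constant column.
    """
    num_focus = len(afi[0])
    scores = []
    for f in range(num_focus):
        column = [row[f] for row in afi]  # one focus setting, all apertures
        scores.append(pvariance(column))  # low variance = consistent with confocal constancy
    return min(range(num_focus), key=scores.__getitem__)


# Toy AFI with A = 3 apertures and F = 4 focus settings (made-up values).
afi = [
    [0.20, 0.35, 0.50, 0.30],
    [0.40, 0.45, 0.50, 0.55],
    [0.70, 0.60, 0.50, 0.80],
]
print(best_focus_index(afi))  # prints 2: the third column is constant across apertures
```

Because the score is computed from one pixel's measurements alone, no spatial window is needed, which is what lets a metric of this kind be evaluated separately for each pixel.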



Author information

Authors and Affiliations

Department of Computer Science, University of Toronto, Toronto, ON, M5S 3G4, Canada
Samuel W. Hasinoff & Kiriakos N. Kutulakos


Corresponding author

Correspondence to Samuel W. Hasinoff.

Additional information

Part of this work was done while the authors were visiting Microsoft Research Asia as a research intern and a visiting scholar, respectively.


About this article

Cite this article

Hasinoff, S.W., Kutulakos, K.N. Confocal Stereo. Int J Comput Vis 81, 82–104 (2009). https://doi.org/10.1007/s11263-008-0164-2


Received: 25 October 2006
Accepted: 29 July 2008
Published: 10 September 2008
Issue Date: January 2009
DOI: https://doi.org/10.1007/s11263-008-0164-2
