Abstract

We present confocal stereo, a new method for computing 3D shape by controlling the focus and aperture of a lens. The method is specifically designed for reconstructing scenes with high geometric complexity or fine-scale texture. To achieve this, we introduce the confocal constancy property, which states that as the lens aperture varies, the pixel intensity of a visible in-focus scene point varies in a scene-independent way that can be predicted by prior radiometric lens calibration. The only requirement is that incoming radiance within the cone subtended by the largest aperture is nearly constant. First, we develop a detailed lens model that factors out the distortions in high-resolution SLR cameras (12 MP or more) with large-aperture lenses (e.g., f/1.2). This allows us to assemble an A×F aperture-focus image (AFI) for each pixel, which collects the undistorted measurements over all A apertures and F focus settings. In the AFI representation, confocal constancy reduces to color comparisons within regions of the AFI and leads to focus metrics that can be evaluated separately for each pixel. We propose two such metrics and present initial reconstruction results for complex scenes, as well as for a scene with known ground-truth shape.
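The confocal constancy idea can be illustrated with a minimal sketch for a single pixel. It assumes the AFI is stored as A rows (apertures) by F columns (focus settings) and that radiometric calibration reduces to one relative gain per aperture; both are simplifications. The variance score is a stand-in chosen for illustration and is not claimed to be either of the two metrics proposed in the paper. The names `normalize_afi`, `best_focus_index`, and `aperture_gain` are hypothetical.

```python
from statistics import pvariance


def normalize_afi(afi, aperture_gain):
    """Divide each aperture row by its relative gain (assumed calibration)."""
    return [[v / g for v in row] for row, g in zip(afi, aperture_gain)]


def best_focus_index(afi):
    """Return the focus column that varies least across apertures, plus all scores.

    Under confocal constancy, an in-focus pixel should have (nearly) the same
    calibrated intensity at every aperture, so its column variance is small.
    """
    num_focus = len(afi[0])
    scores = [pvariance([row[f] for row in afi]) for f in range(num_focus)]
    return min(range(num_focus), key=scores.__getitem__), scores


# Toy AFI for one pixel: 3 apertures (rows) x 4 focus settings (columns).
raw_afi = [
    [10.0, 12.0, 20.0, 14.0],  # smallest aperture
    [28.0, 30.0, 40.0, 22.0],
    [70.0, 66.0, 80.0, 50.0],  # largest aperture
]
aperture_gain = [1.0, 2.0, 4.0]  # made-up relative gains, not measured values

afi = normalize_afi(raw_afi, aperture_gain)
focus_idx, scores = best_focus_index(afi)
print(focus_idx, scores)
```

In this toy data only the third focus setting yields the same calibrated intensity at every aperture, so it is selected as the in-focus setting. Because the score uses one pixel's measurements alone, it reflects the abstract's point that the focus metric can be evaluated separately for each pixel.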

Author information

Corresponding author

Correspondence to Samuel W. Hasinoff.

Additional information

Part of this work was done while the authors were visiting Microsoft Research Asia, in the roles of research intern and Visiting Scholar respectively.

About this article

Cite this article

Hasinoff, S.W., Kutulakos, K.N. Confocal Stereo. Int J Comput Vis 81, 82–104 (2009). https://doi.org/10.1007/s11263-008-0164-2

  • DOI: https://doi.org/10.1007/s11263-008-0164-2
