
Combining Shape from Shading and Stereo: A Joint Variational Method for Estimating Depth, Illumination and Albedo

Published in: International Journal of Computer Vision

Abstract

Shape from shading (SfS) and stereo are two fundamentally different strategies for image-based 3-D reconstruction. While approaches for SfS infer the depth solely from pixel intensities, methods for stereo are based on a matching process that establishes correspondences across images. This difference in approaching the reconstruction problem yields complementary advantages that are worth combining. So far, however, most “joint” approaches are based on an initial stereo mesh that is subsequently refined using shading information. In this paper we follow a completely different approach. We propose a joint variational method that combines both cues within a single minimisation framework. To this end, we fuse a Lambertian SfS approach with a robust stereo model and supplement the resulting energy functional with a detail-preserving anisotropic second-order smoothness term. Moreover, we extend the resulting model in such a way that it jointly estimates depth, albedo and illumination. This in turn makes the approach applicable to objects with non-uniform albedo as well as to scenes with unknown illumination. Experiments on synthetic and real-world images demonstrate the benefits of our combined approach: They not only show that our method is capable of generating very detailed reconstructions, but also that joint approaches are feasible in practice.


Notes

  1. Model by Ben Dansie (www.thingiverse.com/thing:144775).

  2. A 3D computer graphics software (www.blender.org).

References

  • Agarwal, S., Snavely, N., Simon, I., Seitz, S. M., & Szeliski, R. (2009). Building Rome in a day. In Proceedings of IEEE International Conference on Computer Vision (pp. 72–79).

  • Baillard, C., & Maître, H. (1999). 3-D reconstruction of urban scenes from aerial stereo imagery: A focusing strategy. Computer Vision and Image Understanding, 76(3), 244–258.


  • Basha, T., Moses, Y., & Kiryati, N. (2012). Multi-view scene flow estimation: A view centered variational approach. International Journal of Computer Vision, 101(1), 6–21.


  • Black, M. J., & Anandan, P. (1991). Robust dynamic motion estimation over time. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 296–302).

  • Blake, A., Zisserman, A., & Knowles, G. (1985). Surface descriptions from stereo and shading. Image and Vision Computing, 3(4), 183–191.


  • Bleyer, M., Rhemann, C., & Rother, C. (2011). PatchMatch stereo: Stereo matching with slanted support windows. In Proceedings of British Machine Vision Conference (pp. 14:1–14:11).

  • Boulanger, J., Elbau, P., Pontow, C., & Scherzer, O. (2011). Non-local functionals for imaging. In Fixed-point algorithms for inverse problems in science and engineering, Springer Optimization and Its Applications (pp. 131–154). Springer. https://doi.org/10.1007/978-1-4419-9569-8_8


  • Bredies, K., Kunisch, K., & Pock, T. (2010). Total generalized variation. SIAM Journal on Imaging Sciences, 3(3), 492–526.


  • Breuß, M., Cristiani, E., Durou, J.-D., Falcone, M., & Vogel, O. (2012). Perspective shape from shading: Ambiguity analysis and numerical approximations. SIAM Journal on Imaging Sciences, 5(1), 311–342.


  • Brox, T., Bruhn, A., Papenberg, N., & Weickert, J. (2004). High accuracy optical flow estimation based on a theory for warping. In Proceedings of European Conference on Computer Vision (pp. 25–36).


  • Bruhn, A., & Weickert, J. (2005). Towards ultimate motion estimation: Combining highest accuracy with real-time performance. In Proceedings of IEEE International Conference on Computer Vision (pp. 749–755).

  • Camilli, F., & Tozza, S. (2017). A unified approach to the well-posedness of some non-Lambertian models in shape-from-shading theory. SIAM Journal on Imaging Sciences, 10(1), 26–46.


  • Charbonnier, P., Blanc-Féraud, L., Aubert, G., & Barlaud, M. (1997). Deterministic edge-preserving regularization in computed imaging. IEEE Transactions on Image Processing, 6(2), 298–311.


  • Chen, Q., & Koltun, V. (2013). A simple model for intrinsic image decomposition with depth cues. In Proceedings of IEEE International Conference on Computer Vision (pp. 241–248).

  • Cryer, J. E., Tsai, P.-S., & Shah, M. (1995). Integration of shape from shading and stereo. Pattern Recognition, 28(7), 1033–1043.


  • Di Zenzo, S. (1986). A note on the gradient of a multi-image. Computer Vision, Graphics and Image Processing, 33, 116–125.


  • Förstner, W., & Gülch, E. (1987). A fast operator for detection and precise location of distinct points, corners and centres of circular features. In Proceedings of ISPRS Intercommission Conference on Fast Processing of Photogrammetric Data (pp. 281–305).

  • Fua, P., & Leclerc, Y. G. (1995). Object-centered surface reconstruction: Combining multi-image stereo and shading. International Journal of Computer Vision, 16(1), 35–56.


  • Galliani, S., Lasinger, K., & Schindler, K. (2015). Massively parallel multiview stereopsis by surface normal diffusion. In Proceedings of IEEE International Conference on Computer Vision (pp. 873–881).

  • Gelfand, I. M., & Fomin, S. V. (2000). Calculus of variations. New York: Dover.


  • Graber, G., Balzer, J., Soatto, S., & Pock, T. (2015). Efficient minimal-surface regularization of perspective depth maps in variational stereo. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 511–520).

  • Hafner, D., Schroers, C., & Weickert, J. (2015). Introducing maximal anisotropy into second order coupling models. In Proceedings of German Conference on Pattern Recognition (pp. 79–90).


  • Haines, T. S. F., & Wilson, R. C. (2007). Integrating stereo with shape-from-shading derived orientation information. In Proceedings of British Machine Vision Conference (pp. 84:1–84:10).

  • Han, Y., Lee, J.-Y., & Kweon, I. S. (2013). High quality shape from a single RGB-D image under uncalibrated natural illumination. In Proceedings of IEEE International Conference on Computer Vision (pp. 1617–1624).

  • Hirschmüller, H. (2008). Stereo processing by semi-global matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 328–341.


  • Horn, B. (1970). Shape from shading: A method for obtaining the shape of a smooth opaque object from one view. Ph.D. thesis, Massachusetts Institute of Technology.

  • Hougen, D. R., & Ahuja, N. (1994). Adaptive polynomial modelling of the reflectance map for shape estimation from stereo and shading. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 991–994).

  • Immel, D. S., Cohen, M. F., & Greenberg, D. P. (1986). A radiosity method for non-diffuse environments. In Proceedings of SIGGRAPH (pp. 133–142).


  • Jin, H., Cremers, D., Wang, D., Prados, E., Yezzi, A., & Soatto, S. (2008). 3-D reconstruction of shaded objects from multiple images under unknown illumination. International Journal of Computer Vision, 76(3), 245–256.


  • Ju, Y. C., Maurer, D., Breuß, M., & Bruhn, A. (2016). Direct variational perspective shape from shading with Cartesian depth parametrisation. In Perspectives in shape analysis, Mathematics and visualization (pp. 43–72). Springer. https://doi.org/10.1007/978-3-319-24726-7_3


  • Kajiya, J. T. (1986). The rendering equation. In Proceedings of SIGGRAPH (pp. 143–150).


  • Langguth, F., Sunkavalli, K., Hadap, S., & Goesele, M. (2016). Shading-aware multi-view stereo. In Proceedings of European Conference on Computer Vision (pp. 469–485).


  • Leclerc, Y. G., & Bobick, A. (1991). The direct computation of height from shading. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 552–558).

  • Liu-Yin, Q., Yu, R., Fitzgibbon, A., Agapito, L., & Russell, C. (2016). Better together: Joint reasoning for non-rigid 3D reconstruction with specularities and shading. In Proceedings of British Machine Vision Conference (pp. 42:1–42:12).

  • Marr, D., & Poggio, T. (1976). Cooperative computation of stereo disparity. Science, 194, 283–287.


  • Maurer, D., Ju, Y. C., Breuß, M., & Bruhn, A. (2015). An efficient linearisation approach for variational perspective shape from shading. In Proceedings of German Conference on Pattern Recognition (pp. 249–261).


  • Maurer, D., Ju, Y. C., Breuß, M., & Bruhn, A. (2016). Combining shape from shading and stereo: A variational approach for the joint estimation of depth, illumination and albedo. In Proceedings of British Machine Vision Conference (pp. 76:1–76:14).

  • Mecca, R., Quéau, Y., Logothetis, F., & Cipolla, R. (2016). A single-lobe photometric stereo approach for heterogeneous material. SIAM Journal on Imaging Sciences, 9(4), 1858–1888.


  • Mémin, E., & Pérez, P. (1998). A multigrid approach for hierarchical motion estimation. In Proceedings of IEEE International Conference on Computer Vision (pp. 933–938).

  • Newcombe, R. A., Lovegrove, S. J., & Davison, A. J. (2011). DTAM: Dense tracking and mapping in real-time. In Proceedings of IEEE International Conference on Computer Vision (pp. 2320–2327).

  • Okatani, T., & Deguchi, K. (1997). Shape reconstruction from an endoscope image by shape from shading technique for a point light source at the projection center. Computer Vision and Image Understanding, 66(2), 119–131.


  • Or-El, R., Hershkovitz, R., Wetzler, A., Rosman, G., Bruckstein, A. M., & Kimmel, R. (2016). Real-time depth refinement for specular objects. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 4378–4386).

  • Perona, P., & Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7), 629–639.


  • Phong, B. T. (1975). Illumination for computer generated pictures. Communications of the ACM, 18(6), 311–317.


  • Prados, E., & Faugeras, O. (2005). Shape from shading: A well-posed problem? In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 870–877).

  • Quéau, Y., Lauze, F., & Durou, J.-D. (2015). A \(L^1\)-TV algorithm for robust perspective photometric stereo with spatially-varying lightings. In Proceedings of International Conference on Scale Space and Variational Methods in Computer Vision (pp. 498–510).


  • Ranftl, R., Gehrig, S., Pock, T., & Bischof, H. (2012). Pushing the limits of stereo using variational stereo estimation. In Proceedings of IEEE Intelligent Vehicles Symposium (pp. 401–407).

  • Robert, L., & Deriche, R. (1996). Dense depth map reconstruction: A minimization and regularization approach which preserves discontinuities. In Proceedings of European Conference on Computer Vision (pp. 439–451).


  • Rouy, E., & Tourin, A. (1992). A viscosity solutions approach to shape-from-shading. SIAM Journal on Numerical Analysis, 29, 867–884.


  • Samaras, D., Metaxas, D., Fua, P., & Leclerc, Y. G. (2000). Variable albedo surface reconstruction from stereo and shape from shading. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 480–487).

  • Schroers, C., Hafner, D., & Weickert, J. (2015). Multiview depth parameterisation with second order regularisation. In Proceedings of International Conference on Scale Space and Variational Methods in Computer Vision (pp. 551–562).


  • Semerjian, B. (2014). A new variational framework for multiview surface reconstruction. In Proceedings of European Conference on Computer Vision (pp. 719–734).


  • Simoncelli, E. P., Adelson, E. H., & Heeger, D. J. (1991). Probability distributions of optical flow. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 310–315).

  • Strecha, C., von Hansen, W., Van Gool, L., Fua, P., & Thoennessen, U. (2008). On benchmarking camera calibration and multi-view stereo for high resolution imagery. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8).

  • Valgaerts, L., Wu, C., Bruhn, A., Seidel, H.-P., & Theobalt, C. (2012). Lightweight binocular facial performance capture under uncontrolled lighting. ACM Transactions on Graphics, 31(6), 1–11.


  • van Diggelen, J. (1951). A photometric investigation of the slopes and the heights of the ranges of hills in the Maria of the Moon. Bulletin of the Astronomical Institutes of the Netherlands, 1, 283–289.


  • Vogel, O., Bruhn, A., Weickert, J., & Didas, S. (2007). Direct shape-from-shading with adaptive higher order regularisation. In Proceedings of International Conference on Scale Space and Variational Methods in Computer Vision (pp. 871–882).

  • Volz, S., Bruhn, A., Valgaerts, L., & Zimmer, H. (2011). Modeling temporal coherence for optical flow. In Proceedings of IEEE International Conference on Computer Vision (pp. 1116–1123).

  • Weickert, J. (1998). Anisotropic diffusion in image processing. Stuttgart: Teubner.


  • Weickert, J., Welk, M., & Wickert, M. (2013). \(L^2\)-stable nonstandard finite differences for anisotropic diffusion. In Proceedings of International Conference on Scale Space and Variational Methods in Computer Vision (pp. 380–391).

  • Wu, C., Narasimhan, S., & Jaramaz, B. (2010). A multi-image shape-from-shading framework for near-lighting perspective endoscopes. International Journal of Computer Vision, 86, 211–228.


  • Wu, C., Wilburn, B., Matsushita, Y., & Theobalt, C. (2011). High-quality shape from multi-view stereo and shading under general illumination. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 969–976).

  • Wu, C., Zollhöfer, M., Nießner, M., Stamminger, M., Izadi, S., & Theobalt, C. (2014). Real-time shading-based refinement for consumer depth cameras. ACM Transactions on Graphics, 33(6), 200:1–200:10.


  • Xu, D., Duan, Q., Zheng, J., Zhang, J., Cai, J., & Cham, T.-J. (2014). Recovering surface details under general unknown illumination using shading and coarse multi-view stereo. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1526–1533).

  • Yoon, K.-J., Prados, E., & Sturm, P. (2010). Joint estimation of shape and reflectance using multiple images with known illumination conditions. International Journal of Computer Vision, 86(2–3), 192–210.


  • Young, D. M. (1971). Iterative solution of large linear systems. New York: Academic Press.


  • Yu, L. F., Yeung, S. K., Tai, Y. W., & Lin, S. (2013). Shading-based shape refinement of RGB-D images. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1415–1422).

  • Yu, T., Xu, N., & Ahuja, N. (2004). Recovering shape and reflectance model of non-Lambertian objects from multiple views. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 226–233).

  • Yu, T., Xu, N., & Ahuja, N. (2007). Shape and view independent reflectance map from multiple views. International Journal of Computer Vision, 73(2), 123–138.


  • Zimmer, H., Bruhn, A., & Weickert, J. (2011). Optic flow in harmony. International Journal of Computer Vision, 93(3), 368–388.


  • Zollhöfer, M., Dai, A., Innmann, M., Wu, C., Stamminger, M., Theobalt, C., et al. (2015). Shading-based refinement on volumetric signed distance functions. ACM Transactions on Graphics, 34(4), 96:1–96:14.



Acknowledgements

This work has been partly funded by the German Research Foundation (DFG) within the joint Project BR 2245/3-1 and BR 4372/1-1. Moreover, we thank the DFG for financial support within Project B04 of SFB/Transregio 161.

Author information

Correspondence to Daniel Maurer.

Additional information

Communicated by Edwin Hancock, Richard Wilson, Will Smith, Adrian Bors and Nick Pears.

Appendices

Appendix A: Pixel-Based Versus Image-Based Coordinates

1.1 Image Coordinates

In the SfS literature the surface is often parametrised in terms of image coordinates, and it is typically assumed to lie at negative Z-coordinates; see e.g. Prados and Faugeras (2005). If we denote the image coordinates by \(\hat{\mathbf {x}}_0=(\hat{x}_0,\hat{y}_0)^\top \), such a parametrisation is given, for instance, by Ju et al. (2016) and Maurer et al. (2016):

$$\begin{aligned} \mathcal {S} \left( \hat{\mathbf {x}}_0 \right) = \left( \begin{array}{c} \dfrac{z \; \hat{x}_0}{\mathtt {f}_0} \\ \dfrac{z \; \hat{y}_0}{\mathtt {f}_0} \\ - z \end{array} \right) \; . \end{aligned}$$
(58)

The resulting surface normal then reads

$$\begin{aligned} \tilde{\mathbf {n}}(\hat{\mathbf {x}}_0)= & {} \mathcal {S}_{x}(\hat{\mathbf {x}}_0) \times \mathcal {S}_{y}(\hat{\mathbf {x}}_0) = \frac{z}{\mathtt {f}_0^2} \left( \begin{array}{c} \mathtt {f}_0 \; \hat{\nabla } z \\ \big (\hat{\nabla } z^\top \hat{\mathbf {x}}_0\big ) + z \end{array} \right) , \end{aligned}$$
(59)

where \(\hat{\nabla }=(\partial _{\hat{x}_0},\partial _{\hat{y}_0})^\top \) denotes the gradient operator in image coordinates. Please note that this surface normal points outwards, i.e. away from the object.

1.2 Pixel Coordinates

In contrast, in our case, the parametrisation of the surface is given in pixel coordinates and the surface is assumed to lie at positive Z-coordinates; see also Graber et al. (2015). The corresponding surface parametrisation then reads

$$\begin{aligned} \mathcal {S} \left( \mathbf {x}_0 \right) = \left( \begin{array}{c} \dfrac{z \; h_{0,x} \; (x_0 - c_{0,x})}{\mathtt {f}_0} \\ \dfrac{z \; h_{0,y} \; (y_0 - c_{0,y})}{\mathtt {f}_0} \\ z \end{array} \right) \; , \end{aligned}$$
(60)

which yields the surface normal

$$\begin{aligned} \tilde{\mathbf {n}}(\mathbf {x}_0)= & {} \mathcal {S}_{x}(\mathbf {x}_0) \times \mathcal {S}_{y}(\mathbf {x}_0)\\ \nonumber= & {} z \; \frac{h_{0,x} \; h_{0,y}}{\mathtt {f}_0^2} \! \left( \begin{array}{c} - \mathtt {f}_0 \; \dfrac{z_x}{h_{0,x}} \\ - \mathtt {f}_0 \; \dfrac{z_y}{h_{0,y}} \\ \nabla z^\top \left( \mathbf {x}_0 \! - \! \mathbf {c}_0\right) + z \end{array} \right) \\ \nonumber= & {} h_{0,x} \; h_{0,y} \; \frac{z}{\mathtt {f}_0^2} \! \left( \begin{array}{c} - \mathtt {f}_0 \; H^{-1}_{0} \; \nabla z \\ \left( H^{-1}_{0} \nabla z \right) ^\top H_{0} \left( \mathbf {x}_0 \! - \! \mathbf {c}_0\right) + z \end{array} \right) , \end{aligned}$$
(61)

where

$$\begin{aligned} H_{0} = \left( \begin{array}{cc} h_{0,x} &{} 0 \\ 0 &{} h_{0,y} \end{array} \right) \end{aligned}$$
(62)

is a diagonal matrix involving the pixel sizes \(h_{0,x}\) and \(h_{0,y}\). Please note that due to the positive sign in the third component of the parametrisation this normal is now pointing inwards, i.e. into the object. Inverting the direction hence yields the desired outward normal

$$\begin{aligned} - \tilde{\mathbf {n}}(\mathbf {x}_0)= & {} -\mathcal {S}_{x}(\mathbf {x}_0) \times \mathcal {S}_{y}(\mathbf {x}_0)\nonumber \\= & {} h_{0,x} \; h_{0,y} \; \frac{z}{\mathtt {f}_0^2}\nonumber \\&\cdot \,\left( \begin{array}{c} \mathtt {f}_0 \; H^{-1}_{0} \; \nabla z \\ -\left( H^{-1}_{0} \nabla z \right) ^\top H_{0} \left( \mathbf {x}_0 \! - \! \mathbf {c}_0\right) - z \end{array} \right) . \end{aligned}$$
(63)

1.3 Comparison

Let us now compare the surface normals in Eqs. (59) and (63). On the one hand, one can observe a difference in the sign of the third component. Since the sign of the third component of the illumination vector field changes according to the sign of the third component of the surface parametrisation, the result of the inner product between surface normal and light direction in Eq. (11) is the same for both parametrisations. On the other hand, one can rewrite the image coordinates \(\hat{\mathbf {x}}_0\) and the associated depth gradient \(\hat{\nabla } z\) in terms of pixel coordinates. It is easy to verify that the corresponding transforms are given by \(\hat{\mathbf {x}}_0 \! = \! H_{0} \! \left( \mathbf {x}_0 \! - \! \mathbf {c}_0\right) \) and \(\hat{\nabla } z \! = \! H_{0}^{-1} \nabla z\), respectively. Finally, ignoring scale—we are only interested in the direction—one can see that the corresponding unit surface normals are equivalent.
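The equivalence of the two parametrisations can also be checked numerically. The following sketch uses made-up values (the focal length, pixel sizes, principal point, sample pixel, depth and depth gradient are all purely illustrative) to evaluate the normals of Eqs. (59) and (63); after applying the coordinate transforms above and accounting for the flipped Z-axis, both normals agree up to the constant factor \(h_{0,x} h_{0,y}\):

```python
import numpy as np

# Illustrative values only (not from the paper):
f0 = 500.0                          # focal length f_0
h = np.array([0.5, 0.5])            # pixel sizes h_{0,x}, h_{0,y}
c0 = np.array([320.0, 240.0])       # principal point c_0
x0 = np.array([400.0, 300.0])       # pixel coordinates x_0
z = 2.0                             # depth z(x_0) > 0
grad_z = np.array([0.1, -0.05])     # depth gradient in pixel coordinates

H0 = np.diag(h)
H0_inv = np.linalg.inv(H0)

# Outward normal of the pixel-based parametrisation, Eq. (63):
n_pix = h[0] * h[1] * z / f0**2 * np.concatenate([
    f0 * H0_inv @ grad_z,
    [-(H0_inv @ grad_z) @ (H0 @ (x0 - c0)) - z],
])

# Transform to image coordinates and evaluate Eq. (59):
x0_img = H0 @ (x0 - c0)             # \hat{x}_0 = H_0 (x_0 - c_0)
grad_z_img = H0_inv @ grad_z        # \hat{\nabla} z = H_0^{-1} \nabla z
n_img = z / f0**2 * np.concatenate([
    f0 * grad_z_img,
    [grad_z_img @ x0_img + z],
])

# Up to the factor h_{0,x} h_{0,y} and the flipped Z-axis, both coincide:
print(np.allclose(n_pix / (h[0] * h[1]),
                  n_img * np.array([1.0, 1.0, -1.0])))  # prints True
```

Note that only the direction matters for the shading term, which is why the constant positive scale factor is irrelevant.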

Appendix B: Derivation of the Euler–Lagrange Equations

In order to derive the Euler–Lagrange equations (EL 1)–(EL 4) of the differential energy (25) one must compute its first variation, i.e. the Gâteaux differential. Due to the sum rule one may split the derivation into the contributions of the individual terms in Eqs. (35), (37), (38), (39), (40) and (43). While the derivation of most contributions is straightforward (Gelfand and Fomin 2000), the contributions related to the differential SfS data term in Eq. (37) are more complicated due to the fact that it contains non-local contributions. However, in contrast to typical non-local approaches that rely on patch-based strategies, e.g. Boulanger et al. (2011), the non-local contributions in our case originate from the upwind scheme approximation employed in the SfS term.

Fig. 14: Illustration of the rearranging step (a) performed in Eq. (69).

Hence, in the following, we focus on the derivation of those contributions to the Euler–Lagrange equations that are related to the SfS data term. Therefore, we first compute the Gâteaux differential of \(D_{\mathsf {sfs}}^k\) at \(( dz^k, \mathbf {dl}^k, \mathbf {d}\varvec{\rho }^k )\) in the direction \(\mathbf {v} = \left( v_1,\mathbf {v}_2,\mathbf {v}_3\right) ^\top \), as carried out in Eq. (69). While the first steps are straightforward, the rearrangement of the terms in step (a) is slightly more difficult. In order to factor out the directions \(v_1\) evaluated at position \(\mathbf {x}_0\) we must account for the additional non-local contributions of the neighbouring positions, i.e. the non-local contributions of \(\mathbf {x}_0 - \mathbf {h}_x^k\), \(\mathbf {x}_0 + \mathbf {h}_x^k\), \(\mathbf {x}_0 - \mathbf {h}_y^k\) and \(\mathbf {x}_0 + \mathbf {h}_y^k\), as illustrated in Fig. 14. Please note that for reasons of simplicity we neglect the fact that we do not have non-local contributions outside the image domain \(\varOmega \), and hence would have a reduced set of neighbours near the boundary. In the last step (b), we can finally identify the coefficients of the directions \(v_1\), \(\mathbf {v}_2\) and \(\mathbf {v}_3\) with the functional derivatives \(\frac{\delta D_{\mathsf {sfs}}^k}{\delta dz^k}\), \(\frac{\delta D_{\mathsf {sfs}}^k}{\delta \mathbf {dl}^k }\) and \(\frac{\delta D_{\mathsf {sfs}}^k}{\delta \mathbf {d}\varvec{\rho }^k }\) which serve as abbreviations.

Since \(\delta D_{\mathsf {sfs}}^k\) must vanish at extrema for any direction including \(\mathbf {v} = \left( v_1, \mathbf {0}, \mathbf {0}\right) ^\top \), \(\mathbf {v} = \left( 0, \mathbf {v}_2, \mathbf {0}\right) ^\top \), and \(\mathbf {v} = \left( 0, \mathbf {0}, \mathbf {v}_3\right) ^\top \), we obtain the following necessary conditions, i.e. Euler–Lagrange equations:

$$\begin{aligned} \frac{\delta D_{\mathsf {sfs}}^k}{\delta dz^k} = 0 , \quad \frac{\delta D_{\mathsf {sfs}}^k}{\delta \mathbf {dl}^k } = 0 , \quad \frac{\delta D_{\mathsf {sfs}}^k}{\delta \mathbf {d}\varvec{\rho }^k } = 0 . \end{aligned}$$
(64)

Please note that the other terms of the differential energy contribute additional terms to the Euler–Lagrange equations. These contributions can be computed in the same way as before. Finally, we obtain:

$$\begin{aligned} 0 = \frac{\delta E^k}{\delta dz^k} =&\bigg ( \frac{\delta D^k_{\mathsf {stereo}}}{\delta dz^k} + \nu \cdot \frac{\delta D^k_{\mathsf {sfs}} }{\delta dz^k} + \alpha _{z} \cdot \frac{\delta R^k_{\mathsf {depth}} }{\delta dz^k} \nonumber \\&+ \alpha _{\mathbf {l}} \cdot \frac{\delta R^k_{\mathsf {illum}} }{\delta dz^k} + \alpha _{\varvec{\rho }} \cdot \frac{\delta R^k_{\mathsf {albedo}}}{\delta dz^k} + \alpha _{\mathsf {inc}} \cdot \frac{\delta R^k_{\mathsf {inc}} }{\delta dz^k}\bigg ) , \end{aligned}$$
(65)
$$\begin{aligned} 0 = \frac{\delta E^k}{\delta \mathbf {dl}^k } =&\bigg ( \frac{\delta D^k_{\mathsf {stereo}}}{\delta \mathbf {dl}^k } + \nu \cdot \frac{\delta D^k_{\mathsf {sfs}} }{\delta \mathbf {dl}^k } + \alpha _{z} \cdot \frac{\delta R^k_{\mathsf {depth}} }{\delta \mathbf {dl}^k } \nonumber \\&+ \alpha _{\mathbf {l}} \cdot \frac{\delta R^k_{\mathsf {illum}} }{\delta \mathbf {dl}^k } + \alpha _{\varvec{\rho }} \cdot \frac{\delta R^k_{\mathsf {albedo}}}{\delta \mathbf {dl}^k } + \alpha _{\mathsf {inc}} \cdot \frac{\delta R^k_{\mathsf {inc}} }{\delta \mathbf {dl}^k }\bigg ) , \end{aligned}$$
(66)
$$\begin{aligned} 0 = \frac{\delta E^k}{\delta \mathbf {d}\varvec{\rho }^k } =&\bigg ( \frac{\delta D^k_{\mathsf {stereo}}}{\delta \mathbf {d}\varvec{\rho }^k} + \nu \cdot \frac{\delta D^k_{\mathsf {sfs}} }{\delta \mathbf {d}\varvec{\rho }^k} + \alpha _{z} \cdot \frac{\delta R^k_{\mathsf {depth}} }{\delta \mathbf {d}\varvec{\rho }^k} \nonumber \\&+ \alpha _{\mathbf {l}} \cdot \frac{\delta R^k_{\mathsf {illum}} }{\delta \mathbf {d}\varvec{\rho }^k} + \alpha _{\varvec{\rho }} \cdot \frac{\delta R^k_{\mathsf {albedo}}}{\delta \mathbf {d}\varvec{\rho }^k} + \alpha _{\mathsf {inc}} \cdot \frac{\delta R^k_{\mathsf {inc}} }{\delta \mathbf {d}\varvec{\rho }^k} \bigg ) , \end{aligned}$$
(67)
$$\begin{aligned} 0 = \frac{\delta E^k}{\delta \mathbf {du}^k } =&\bigg ( \frac{\delta D^k_{\mathsf {stereo}}}{\delta \mathbf {du}^k} + \nu \cdot \frac{\delta D^k_{\mathsf {sfs}} }{\delta \mathbf {du}^k} + \alpha _{z} \cdot \frac{\delta R^k_{\mathsf {depth}} }{\delta \mathbf {du}^k} \nonumber \\&+ \alpha _{\mathbf {l}} \cdot \frac{\delta R^k_{\mathsf {illum}} }{\delta \mathbf {du}^k} + \alpha _{\varvec{\rho }} \cdot \frac{\delta R^k_{\mathsf {albedo}}}{\delta \mathbf {du}^k} + \alpha _{\mathsf {inc}} \cdot \frac{\delta R^k_{\mathsf {inc}} }{\delta \mathbf {du}^k} \bigg ) . \end{aligned}$$
(68)

These four conditions correspond to the Euler–Lagrange equations (EL 1)–(EL 4).

$$\begin{aligned}&\delta D_{\mathsf {sfs}}^k \left( dz^k, \mathbf {dl}^k, \mathbf {d}\varvec{\rho }^k; v_1,\mathbf {v}_2,\mathbf {v}_3 \right) = \lim \limits _{t \rightarrow 0} \frac{D_{\mathsf {sfs}}^k \left( dz^k + t \cdot v_1, \mathbf {dl}^k + t \cdot \mathbf {v}_2, \mathbf {d}\varvec{\rho }^k + t \cdot \mathbf {v}_3 \right) - D_{\mathsf {sfs}}^k \left( dz^k, \mathbf {dl}^k, \mathbf {d}\varvec{\rho }^k \right) }{t}\\&\quad = \frac{d}{dt} D_{\mathsf {sfs}}^k \left( dz^k + t \cdot v_1, \mathbf {dl}^k + t \cdot \mathbf {v}_2, \mathbf {d}\varvec{\rho }^k + t \cdot \mathbf {v}_3 \right) \biggr |_{t = 0}\\&\quad = 2 \cdot \! \! \int _{\varOmega _{0}} \sum _{c=1}^{3} \Bigg \{ \Bigg [ {\phi }^{k,c}(\mathbf {x}_0) + \sum _{\mathbf {h}^k \in H^k} \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial z^k(\mathbf {x}_0 \! + \! \mathbf {h}^k)} \big ( dz^{k}(\mathbf {x}_0 \! + \! \mathbf {h}^k) + t \cdot v_1(\mathbf {x}_0 \! + \! \mathbf {h}^k) \big ))\\&\qquad +\,\left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \mathbf {l}^k(\mathbf {x}_0)} \right) ^\top \left( \mathbf {dl}^k(\mathbf {x}_0) + t \cdot \mathbf {v}_2(\mathbf {x}_0) \right) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \varvec{\rho }^k(\mathbf {x}_0)}\right) ^\top \left( \mathbf {d}\varvec{\rho }^k(\mathbf {x}_0) + t \cdot \mathbf {v}_3(\mathbf {x}_0) \right) \Bigg ]\\&\qquad \cdot \,\Bigg [ \sum _{\mathbf {n}^k \in H^k} \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial z^k(\mathbf {x}_0 \! + \! \mathbf {n}^k)} \cdot v_1(\mathbf {x}_0 + \mathbf {n}^k) \right) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \mathbf {l}^k(\mathbf {x}_0)} \right) ^\top \mathbf {v}_2(\mathbf {x}_0) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \varvec{\rho }^k(\mathbf {x}_0)}\right) ^\top \mathbf {v}_3(\mathbf {x}_0) \Bigg ] \Bigg \} \, \mathrm {d}\mathbf {x}_0 \Biggr |_{t = 0}\\&\quad = 2 \cdot \! \! 
\int _{\varOmega _{0}} \sum _{c=1}^{3} \Bigg \{ \Bigg [ {\phi }^{k,c}(\mathbf {x}_0) + \sum _{\mathbf {h}^k \in H^k} \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial z^k(\mathbf {x}_0 \! + \! \mathbf {h}^k)} \; dz^{k}(\mathbf {x}_0 \! + \! \mathbf {h}^k) + \,\left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \mathbf {l}^k(\mathbf {x}_0)} \right) ^\top \mathbf {dl}^k(\mathbf {x}_0) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \varvec{\rho }^k(\mathbf {x}_0)}\right) ^\top \mathbf {d}\varvec{\rho }^k(\mathbf {x}_0) \Bigg ] \\&\qquad \cdot \, \Bigg [ \sum _{\mathbf {n}^k \in H^k} \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial z^k(\mathbf {x}_0 \! + \! \mathbf {n}^k)} \cdot v_1(\mathbf {x}_0 + \mathbf {n}^k) \right) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \mathbf {l}^k(\mathbf {x}_0)} \right) ^\top \mathbf {v}_2(\mathbf {x}_0) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \varvec{\rho }^k(\mathbf {x}_0)}\right) ^\top \mathbf {v}_3(\mathbf {x}_0) \Bigg ] \Bigg \} \, \mathrm {d}\mathbf {x}_0 \end{aligned}$$
$$\begin{aligned}&\quad {\mathop {=}\limits ^{(a)}} 2 \cdot \! \! \int _{\varOmega _{0}} \sum _{c=1}^{3} \sum _{\mathbf {m}^k \in H^k} \Bigg \{ \Bigg [ {\phi }^{k,c}(\mathbf {x}_0 \! + \! \mathbf {m}^k) + \sum _{\mathbf {h}^k \in H^k} \frac{\partial {\phi }^{k,c}(\mathbf {x}_0 \! + \! \mathbf {m}^k)}{\partial z^k(\mathbf {x}_0 \! + \! \mathbf {m}^k \! + \! \mathbf {h}^k)} dz^{k}(\mathbf {x}_0 \! + \! \mathbf {m}^k \! + \! \mathbf {h}^k) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0 \! + \! \mathbf {m}^k)}{\partial \mathbf {l}^k(\mathbf {x}_0 \! + \! \mathbf {m}^k)} \right) ^\top \mathbf {dl}^k(\mathbf {x}_0 \! + \! \mathbf {m}^k)\nonumber \\&\qquad +\, \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0 \! + \! \mathbf {m}^k)}{\partial \varvec{\rho }^k(\mathbf {x}_0 \! + \! \mathbf {m}^k)} \right) ^\top \mathbf {d}\varvec{\rho }^k(\mathbf {x}_0 \! + \! \mathbf {m}^k) \Bigg ] \cdot \,\frac{\partial {\phi }^{k,c}(\mathbf {x}_0 + \mathbf {m}^k)}{\partial z^k(\mathbf {x}_0 )} \Bigg \} \cdot v_1(\mathbf {x}_0) \nonumber \\&\quad + \,\sum _{c=1}^{3} \Bigg \{ \Bigg [ {\phi }^{k,c}(\mathbf {x}_0) + \sum _{\mathbf {h}^k \in H^k} \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial z^k(\mathbf {x}_0 \! + \! \mathbf {h}^k)} \; dz^{k}(\mathbf {x}_0 \! + \! \mathbf {h}^k) +\, \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \mathbf {l}^k(\mathbf {x}_0)} \right) ^\top \mathbf {dl}^k(\mathbf {x}_0) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \varvec{\rho }^k(\mathbf {x}_0)}\right) ^\top \mathbf {d}\varvec{\rho }^k(\mathbf {x}_0) \Bigg ]\nonumber \\&\qquad \cdot \, \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \mathbf {l}^k(\mathbf {x}_0)} \right) \Bigg \}^\top \mathbf {v}_2(\mathbf {x}_0) \nonumber \\&\quad + \,\sum _{c=1}^{3} \Bigg \{ \Bigg [ {\phi }^{k,c}(\mathbf {x}_0) + \sum _{\mathbf {h}^k \in H^k} \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial z^k(\mathbf {x}_0 \! + \! \mathbf {h}^k)} \; dz^{k}(\mathbf {x}_0 \! + \! 
\mathbf {h}^k) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \mathbf {l}^k(\mathbf {x}_0)} \right) ^\top \mathbf {dl}^k(\mathbf {x}_0) + \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \varvec{\rho }^k(\mathbf {x}_0)}\right) ^\top \mathbf {d}\varvec{\rho }^k(\mathbf {x}_0) \Bigg ] \nonumber \\&\qquad \cdot \, \left( \frac{\partial {\phi }^{k,c}(\mathbf {x}_0)}{\partial \varvec{\rho }^k(\mathbf {x}_0)}\right) \Bigg \}^\top \mathbf {v}_3(\mathbf {x}_0) \; \mathrm {d}\mathbf {x}_0 {\mathop {=}\limits ^{(b)}} 2 \cdot \! \! \int _{\varOmega _{0}} \frac{\delta D_{\mathsf {sfs}}^k}{\delta dz^k}(\mathbf {x}_0) \cdot v_1(\mathbf {x}_0) + \frac{\delta D_{\mathsf {sfs}}^k}{\delta \mathbf {dl}^k }(\mathbf {x}_0) ^\top \mathbf {v}_2(\mathbf {x}_0) + \frac{\delta D_{\mathsf {sfs}}^k}{\delta \mathbf {d}\varvec{\rho }^k }(\mathbf {x}_0) ^\top \mathbf {v}_3(\mathbf {x}_0) \; \mathrm {d}\mathbf {x}_0 \end{aligned}$$
(69)
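The effect of the rearranging step (a) can be sanity-checked on a toy functional. In the sketch below, the residual \(\phi\) is a hypothetical stand-in (a simple forward difference, not the SfS residual of Eq. (37)); it merely mimics the property that each residual references a neighbouring sample, as the upwind scheme does. The check verifies that the functional derivative assembled by collecting neighbour contributions matches a finite-difference approximation of the Gâteaux differential:

```python
import numpy as np

# Toy 1-D analogue of the non-local structure in Eq. (69):
# D(z) = sum_i phi_i(z)^2 with phi_i(z) = z[i+1] - z[i].
rng = np.random.default_rng(0)
n = 16
z = rng.standard_normal(n)

def phi(z):
    return z[1:] - z[:-1]

def energy(z):
    return np.sum(phi(z) ** 2)

def functional_derivative(z):
    # Rearranging step (a): every sample z[i] collects the contributions
    # of all residuals phi_j in which it occurs, with the matching sign.
    p = phi(z)
    g = np.zeros_like(z)
    g[1:] += 2.0 * p      # z[i+1] enters phi_i with factor +1
    g[:-1] -= 2.0 * p     # z[i]   enters phi_i with factor -1
    return g

# Finite-difference Gateaux differential in a random direction v ...
v = rng.standard_normal(n)
t = 1e-6
fd = (energy(z + t * v) - energy(z - t * v)) / (2.0 * t)

# ... agrees with the inner product of the rearranged derivative and v:
analytic = functional_derivative(z) @ v
print(np.isclose(fd, analytic, rtol=1e-5))  # prints True
```

The same bookkeeping, applied per neighbour offset \(\mathbf {m}^k \in H^k\), yields the coefficient of \(v_1(\mathbf {x}_0)\) in step (a) of Eq. (69).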

Appendix C: Pseudocode

Algorithm 1 sketches the employed optimisation strategy in terms of pseudocode.


Appendix D: Model Parameters

In order to obtain the best possible reconstruction, the parameters have to be adjusted properly. While the solver-related parameters mainly affect the trade-off between reconstruction quality and runtime and can thus be fixed in advance (see Sect. 5), the model parameters have a direct impact on the quality and must therefore be selected more carefully. A convenient strategy is to first adjust the parameters of the pure stereo model and then extend the parameter set to those of the combined model. In our experience, the optimal model parameters may vary with the intrinsic camera parameters as well as with the distance of the scene to the camera. Still, some of the model parameters turned out to be suitable for a wide range of images and can hence also be fixed in advance; see Sect. 5. What remains to be adjusted are essentially the hyperparameters of the terms of the differential energy, which we specify in Table 2. The parameter settings for the methods of Galliani et al. (2015) and Graber et al. (2015) are also given in Table 2. Regarding the method of Graber et al., we even optimised the level-dependent parameter scaling function in the Python code from

$$\begin{aligned} \lambda _{l} = \lambda \cdot \left( \frac{1}{\eta }\right) ^l \quad \text{ to } \quad \lambda _{l} = \lambda \cdot \left( \frac{1}{\eta }\right) ^{1.5 \cdot l} , \end{aligned}$$
(70)

which led to a noticeable improvement of the results. In this context l denotes the current pyramid level (the finest resolution is \(l=0\)).
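As a small illustration of Eq. (70), the scaling can be computed as follows; the base values of \(\lambda \) and \(\eta \) are hypothetical placeholders, not values taken from Table 2:

```python
# Level-dependent weight of Eq. (70); lam (base weight) and eta
# (pyramid downsampling factor) are illustrative placeholders.
lam, eta = 1.0, 0.5

def lambda_level(l, exponent=1.5):
    """Weight on pyramid level l (l = 0 is the finest level).

    exponent=1.0 reproduces the original scaling, exponent=1.5
    the modified one of Eq. (70).
    """
    return lam * (1.0 / eta) ** (exponent * l)

# With eta = 0.5 the modified scaling grows faster towards coarse levels:
print(lambda_level(2, exponent=1.0), lambda_level(2))  # prints 4.0 8.0
```

The larger exponent thus emphasises the smoothness term more strongly on coarse levels, which is consistent with the reported improvement.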

Table 2 Parameter settings for the different datasets and approaches


Cite this article

Maurer, D., Ju, Y.C., Breuß, M. et al. Combining Shape from Shading and Stereo: A Joint Variational Method for Estimating Depth, Illumination and Albedo. Int J Comput Vis 126, 1342–1366 (2018). https://doi.org/10.1007/s11263-018-1079-1
