Gradient Shape Model

International Journal of Computer Vision

Abstract

For years, the so-called Constrained Local Model (CLM) and its variants have been the gold standard in face alignment tasks. The CLM combines an ensemble of local feature detectors whose locations are regularized by a shape model. Fitting such a model typically consists of an exhaustive local search using the detectors and a global optimization that finds the CLM’s parameters that jointly maximize all the responses. However, one major drawback of CLMs is the inefficiency of the local search, which relies on a large number of expensive convolutions. This paper introduces the Gradient Shape Model (GSM), a novel approach that addresses this limitation. We are able to align a similar CLM model without the need for any convolutions at all. We also use true analytical gradient and Hessian matrices, which are easy to compute, instead of their approximations. Our formulation is very general, allowing an optional 3D shape term to be seamlessly included. Additionally, we expand the GSM formulation through a cascade regression framework. This revised technique allows a substantial reduction in the complexity/dimensionality of the data term, making it possible to compute a denser, more accurate regression step per cascade level. Experiments on several standard datasets show that our proposed models perform faster than state-of-the-art CLMs and better than recent cascade regression approaches.

Notes

  1. It is worth mentioning that some authors use a slightly different pose parametrization \((\varvec{ \theta }'= [a-1,\, b,\, t_x, \, t_y]^T)\) that allows a special set of 4 eigenvectors, which linearly model the 2D pose, to be appended to \(\Phi \) (Matthews and Baker 2004).

  2. The LFW dataset was excluded due to the lack of landmark annotations in the face outer region.
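
As an illustration of Note 1, the following is a minimal sketch, not the authors' implementation, of why the parametrization \([a-1,\, b,\, t_x,\, t_y]^T\) makes the 2D pose linear in its parameters. It assumes shapes are stored as \(2v\) vectors laid out as \([x_1..x_v,\, y_1..y_v]\) and uses an unnormalized pose basis for simplicity; the function names are hypothetical.

```python
import numpy as np

def similarity_direct(s0, a, b, tx, ty):
    """Apply the similarity transform [[a, -b], [b, a]] plus a translation."""
    v = s0.size // 2
    x, y = s0[:v], s0[v:]
    return np.concatenate([a * x - b * y + tx, b * x + a * y + ty])

def pose_basis(s0):
    """Four (unnormalized) basis vectors that linearly span the 2D pose."""
    v = s0.size // 2
    x, y = s0[:v], s0[v:]
    return np.column_stack([
        s0,                                         # scaling direction
        np.concatenate([-y, x]),                    # in-plane rotation direction
        np.concatenate([np.ones(v), np.zeros(v)]),  # translation in x
        np.concatenate([np.zeros(v), np.ones(v)]),  # translation in y
    ])

def similarity_linear(s0, a, b, tx, ty):
    """Same transform, written as s0 plus a linear combination of the basis."""
    theta_prime = np.array([a - 1.0, b, tx, ty])
    return s0 + pose_basis(s0) @ theta_prime

# Arbitrary example data (assumed values, for illustration only).
rng = np.random.default_rng(0)
s0_example = rng.normal(size=10)
direct = similarity_direct(s0_example, 1.2, 0.3, 4.0, -2.0)
linear = similarity_linear(s0_example, 1.2, 0.3, 4.0, -2.0)
```

Both functions produce the same shape, which is what allows the pose to be handled like any other linear shape mode.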

References

  • Akhter, I., Sheikh, Y., Khan, S., & Kanade, T. (2008). Nonrigid structure from motion in trajectory space. In Neural information processing systems.

  • Alabort-i-Medina, J., & Zafeiriou, S. (2017). A unified framework for compositional fitting of active appearance models. International Journal of Computer Vision, 121(1), 26–64.

  • Asthana, A., Zafeiriou, S., Cheng, S., & Pantic, M. (2013). Robust discriminative response map fitting with constrained local models. In IEEE conference on computer vision and pattern recognition.

  • Baker, S., & Matthews, I. (2001). Equivalence and efficiency of image alignment algorithms. In IEEE conference on computer vision and pattern recognition.

  • Baltrušaitis, T., Robinson, P., & Morency, L. (2013). Constrained local neural fields for robust facial landmark detection in the wild. In IEEE international conference on computer vision workshop, 300 faces in-the-wild challenge (300-W).

  • Belhumeur, P. N., Jacobs, D. W., Kriegman, D. J., & Kumar, N. (2011). Localizing parts of faces using a consensus of exemplars. In IEEE conference on computer vision and pattern recognition.

  • Boddeti, V. N., Kanade, T., & Kumar, B. V. K. V. (2013). Correlation filters for object alignment. In IEEE conference on computer vision and pattern recognition.

  • Bolme, D. S., Beveridge, J. R., Draper, B. A., & Lui, Y. M. (2010). Visual object tracking using adaptive correlation filters. In IEEE conference on computer vision and pattern recognition.

  • Bulat, A., & Tzimiropoulos, G. (2017a). Binarized convolutional landmark localizers for human pose estimation and face alignment with limited resources. In IEEE international conference on computer vision.

  • Bulat, A., & Tzimiropoulos, G. (2017b). How far are we from solving the 2d and 3d face alignment problem? (and a dataset of 230,000 3d facial landmarks). In IEEE international conference on computer vision.

  • Burgos-Artizzu, X. P., Perona, P., & Dollár, P. (2013). Robust face landmark estimation under occlusion. In IEEE international conference on computer vision.

  • Cao, X., Wei, Y., Wen, F., & Sun, J. (2012). Face alignment by explicit shape regression. In IEEE conference on computer vision and pattern recognition.

  • Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), 603–619.

  • Cootes, T. F., & Taylor, C. J. (2004). Statistical models of appearance for computer vision. Technical report, Imaging Science and Biomedical Engineering, University of Manchester.

  • Cootes, T. F., Taylor, C. J., Cooper, D. H., & Graham, J. (1995). Active shape models—Their training and application. Computer Vision and Image Understanding, 61(1), 38–59.

  • Cootes, T. F., Edwards, G. J., & Taylor, C. J. (2001). Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681–685.

  • Cootes, T. F., Ionita, M., Lindner, C., & Sauer, P. (2012). Robust and accurate shape model fitting using random forest regression voting. In European conference on computer vision.

  • Cristinacce, D., & Cootes, T. F. (2006). Feature detection and tracking with constrained local models. In British machine vision conference.

  • Cristinacce, D., & Cootes, T. F. (2007). Boosted regression active shape models. In British machine vision conference.

  • Cristinacce, D., & Cootes, T. F. (2008). Automatic feature localisation with constrained local models. Pattern Recognition, 41(10), 3054–3067.

  • Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE conference on computer vision and pattern recognition.

  • Dantone, M., Gall, J., Fanelli, G., & Gool, L. V. (2012). Real-time facial feature detection using conditional regression forests. In IEEE conference on computer vision and pattern recognition.

  • Face++. (2018). Megvii face API. http://www.faceplusplus.com.

  • Fan, H., & Zhou, E. (2016). Approaching human level facial landmark localization by deep learning. Image and Vision Computing, 47, 27–35.

  • Fanelli, G., Dantone, M., & Gool, L. V. (2013). Real time 3d face alignment with random forests-based active appearance models. In IEEE international conference on automatic face and gesture recognition.

  • Galoogahi, H. K., Sim, T., & Lucey, S. (2013). Multi-channel correlation filters. In IEEE international conference on computer vision.

  • Gu, L., & Kanade, T. (2008). A generative shape regularization model for robust face alignment. In European conference on computer vision.

  • Guo, P. (2014). https://github.com/phg1024/CSCE625/tree/master/finalproject.

  • Henriques, J. F., Carreira, J., Caseiro, R., & Batista, J. (2013). Beyond hard negative mining: Efficient detector learning via block-circulant decomposition. In IEEE international conference on computer vision.

  • Huang, G. B., Ramesh, M., Berg, T., & Learned-Miller, E. (2007). Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical report 07-49, University of Massachusetts, Amherst.

  • Huang, Z., Zhou, E., & Cao, Z. (2015). Coarse-to-fine face alignment with multi-scale local patch regression. arXiv:1511.04901.

  • Jacobs, H. O. (2014). How to stare at the higher-order n-dimensional chain rule without losing your marbles. Technical report. arXiv:1410.3493v3.

  • Jourabloo, A., & Liu, X. (2015). Pose-invariant 3d face alignment. In IEEE international conference on computer vision.

  • Kazemi, V., & Sullivan, J. (2014). One millisecond face alignment with an ensemble of regression trees. In IEEE conference on computer vision and pattern recognition.

  • Le, V., Brandt, J., Lin, Z., Boudev, L., & Huang, T. S. (2012). Interactive facial feature localization. In European conference on computer vision.

  • Lee, D., Park, H., & Yoo, C. D. (2015). Face alignment using cascade Gaussian process regression trees. In IEEE conference on computer vision and pattern recognition.

  • Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.

  • Martins, P., Caseiro, R., Henriques, J. F., & Batista, J. (2012). Discriminative Bayesian active shape models. In European conference on computer vision.

  • Martins, P., Caseiro, R., & Batista, J. (2014). Non-parametric Bayesian constrained local models. In IEEE conference on computer vision and pattern recognition.

  • Martins, P., Henriques, J. F., Caseiro, R., & Batista, J. (2016). Bayesian constrained local models revisited. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(4), 704–716.

  • Matthews, I., & Baker, S. (2004). Active appearance models revisited. International Journal of Computer Vision, 60(1), 135–164.

  • Messer, K., Matas, J., Kittler, J., Luettin, J., & Maitre, G. (1999). XM2VTSDB: The extended M2VTS database. In International conference on audio and video-based biometric person authentication.

  • Newell, A., Yang, K., & Deng, J. (2016). Stacked hourglass networks for human pose estimation. In European conference on computer vision.

  • Nordstrom, M., Larsen, M., Sierakowski, J., & Stegmann, M. (2004). The IMM face database—An annotated dataset of 240 face images. Technical report. Technical University of Denmark, DTU.

  • Paquet, U. (2009). Convexity and Bayesian constrained local models. In IEEE conference on computer vision and pattern recognition.

  • Ren, S., Cao, X., Wei, Y., & Sun, J. (2014). Face alignment at 3000 fps via regressing local binary features. In IEEE conference on computer vision and pattern recognition.

  • Sagonas, C., Tzimiropoulos, G., Zafeiriou, S., & Pantic, M. (2013a). 300 faces in-the-wild challenge: The first facial landmark localization challenge. In IEEE international conference on computer vision workshops.

  • Sagonas, C., Tzimiropoulos, G., Zafeiriou, S., & Pantic, M. (2013b). A semi-automatic methodology for facial landmark annotation. In IEEE conference on computer vision and pattern recognition workshops.

  • Sagonas, C., Antonakos, E., Tzimiropoulos, G., & Pantic, M. (2016). 300 faces in-the-wild challenge: Database and results. Image and Vision Computing, Special Issue on Facial Landmark Localisation ‘In-The-Wild’, 47, 3–18.

  • Sánchez-Lozano, E., Tzimiropoulos, G., Martinez, B., De la Torre, F., & Valstar, M. (2018). A functional regression approach to facial landmark tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2017.2745568.

  • Saragih, J., Lucey, S., & Cohn, J. (2009). Face alignment through subspace constrained mean-shifts. In IEEE international conference on computer vision.

  • Saragih, J., Lucey, S., & Cohn, J. (2010). Deformable model fitting by regularized landmark mean-shifts. International Journal of Computer Vision, 91(2), 200–215.

  • Silverman, B. W. (1986). Density estimation for statistics and data analysis. London: Chapman and Hall.

  • Songsri-in, K., Trigeorgis, G., & Zafeiriou, S. (2018). Deep & deformable: Convolutional mixtures of deformable part-based models. In IEEE conference on automatic face and gesture recognition.

  • Sun, Y., Wang, X., & Tang, X. (2013). Deep convolutional network cascade for facial point detection. In IEEE conference on computer vision and pattern recognition.

  • Trigeorgis, G., Snape, P., Nicolaou, M., Antonakos, E., & Zafeiriou, S. (2016). Mnemonic descent method: A recurrent process applied for end-to-end face alignment. In IEEE conference on computer vision and pattern recognition.

  • Tzimiropoulos, G. (2015). Project-out cascaded regression with an application to face alignment. In IEEE conference on computer vision and pattern recognition.

  • Tzimiropoulos, G., & Pantic, M. (2014). Gauss–Newton deformable part models for face alignment in-the-wild. In IEEE conference on computer vision and pattern recognition.

  • Tzimiropoulos, G., & Pantic, M. (2017). Fast algorithms for fitting active appearance models to unconstrained images. International Journal of Computer Vision, 122(1), 17–33.

  • Tzimiropoulos, G., i Medina, J. A., Zafeiriou, S., & Pantic, M. (2012). Generic active appearance models revisited. In Asian conference on computer vision.

  • Valstar, M. F., Martinez, B., Binefa, X., & Pantic, M. (2010). Facial point detection using boosted regression and graph models. In IEEE conference on computer vision and pattern recognition.

  • Viola, P., & Jones, M. (2002). Robust real-time object detection. International Journal of Computer Vision, 57(2), 137–154.

  • Wang, Y., Lucey, S., & Cohn, J. (2008). Enforcing convexity for improved alignment with constrained local models. In IEEE conference on computer vision and pattern recognition.

  • Xiao, J., Baker, S., Matthews, I., & Kanade, T. (2004a). Real-time combined 2d+3d active appearance models. In IEEE conference on computer vision and pattern recognition.

  • Xiao, J., Chai, J., & Kanade, T. (2004b). A closed-form solution to non-rigid shape and motion recovery. In European conference on computer vision.

  • Xiong, X., & De la Torre, F. (2013). Supervised descent method and its application to face alignment. In IEEE conference on computer vision and pattern recognition.

  • Xiong, X., & De la Torre, F. (2014). Supervised descent method for solving nonlinear least squares problems in computer vision. Technical report. arXiv:1405.0601.

  • Xiong, X., & De la Torre, F. (2015). Global supervised descent method. In IEEE conference on computer vision and pattern recognition.

  • Zadeh, A., Baltrušaitis, T., & Morency, L. P. (2017). Convolutional experts constrained local model for facial landmark detection. In IEEE computer vision and pattern recognition workshop (CVPRW), 2nd facial landmark localisation competition.

  • Zhang, J., Shan, S., Kan, M., & Chen, X. (2014a). Coarse-to-fine auto-encoder networks (CFAN) for real-time face alignment. In European conference on computer vision.

  • Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2014b). Facial landmark detection by deep multi-task learning. In European conference on computer vision.

  • Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2016). Learning deep representation for face alignment with auxiliary attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 918–930.

  • Zhou, E., Fan, H., Cao, Z., Jiang, Y., & Yin, Q. (2013a). Extensive facial landmark localization with coarse-to-fine convolutional network cascade. In IEEE international conference on computer vision workshop, 300 faces in-the-wild challenge (300-W).

  • Zhou, F., Brandt, J., & Lin, Z. (2013b). Exemplar-based graph matching for robust facial landmark localization. In IEEE international conference on computer vision.

  • Zhu, S., Li, C., Loy, C., & Tang, X. (2015). Face alignment by coarse-to-fine shape searching. In IEEE conference on computer vision and pattern recognition.

  • Zhu, X., & Ramanan, D. (2012). Face detection, pose estimation, and landmark localization in the wild. In IEEE conference on computer vision and pattern recognition.

Acknowledgements

This work was supported by the Portuguese Science Foundation (FCT) through the Grant SFRH/BPD/90200/2012 (P. Martins).

Author information

Corresponding author

Correspondence to Pedro Martins.

Additional information

Communicated by Ming-Hsuan Yang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Gradient Definitions

1.1 Hessian of the 2D Regularization Term (\(\mathbf{H}_{\text {R}}\))

The Hessian of the 2D regularization term is a \((2v+4)\) square matrix of the form:

$$\begin{aligned} \mathbf{H}_{\text {R}} = \left[ \begin{array}{ccccccccc}
\ddots & & \ldots & & & \vdots & \vdots & \vdots & \vdots \\
 & \frac{\partial ^2 R}{\partial x_i^2} & & & & \frac{\partial ^2 R}{\partial x_i \partial a} & \frac{\partial ^2 R}{\partial x_i \partial b} & \frac{\partial ^2 R}{\partial x_i \partial t_x} & \frac{\partial ^2 R}{\partial x_i \partial t_y} \\
\vdots & & \ddots & & \vdots & \vdots & \vdots & \vdots & \vdots \\
 & & & \frac{\partial ^2 R}{\partial y_i^2} & & \frac{\partial ^2 R}{\partial y_i \partial a} & \frac{\partial ^2 R}{\partial y_i \partial b} & \frac{\partial ^2 R}{\partial y_i \partial t_x} & \frac{\partial ^2 R}{\partial y_i \partial t_y} \\
 & & \ldots & & \ddots & \vdots & \vdots & \vdots & \vdots \\
\ldots & \frac{\partial ^2 R}{\partial a \partial x_i} & \ldots & \frac{\partial ^2 R}{\partial a \partial y_i} & \ldots & \frac{\partial ^2 R}{\partial a^2} & \frac{\partial ^2 R}{\partial a \partial b} & \frac{\partial ^2 R}{\partial a \partial t_x} & \frac{\partial ^2 R}{\partial a \partial t_y} \\
\ldots & \frac{\partial ^2 R}{\partial b \partial x_i} & \ldots & \frac{\partial ^2 R}{\partial b \partial y_i} & \ldots & \frac{\partial ^2 R}{\partial b \partial a} & \frac{\partial ^2 R}{\partial b^2} & \frac{\partial ^2 R}{\partial b \partial t_x} & \frac{\partial ^2 R}{\partial b \partial t_y} \\
\ldots & \frac{\partial ^2 R}{\partial t_x \partial x_i} & \ldots & \frac{\partial ^2 R}{\partial t_x \partial y_i} & \ldots & \frac{\partial ^2 R}{\partial t_x \partial a} & \frac{\partial ^2 R}{\partial t_x \partial b} & \frac{\partial ^2 R}{\partial t_x^2} & \frac{\partial ^2 R}{\partial t_x \partial t_y} \\
\ldots & \frac{\partial ^2 R}{\partial t_y \partial x_i} & \ldots & \frac{\partial ^2 R}{\partial t_y \partial y_i} & \ldots & \frac{\partial ^2 R}{\partial t_y \partial a} & \frac{\partial ^2 R}{\partial t_y \partial b} & \frac{\partial ^2 R}{\partial t_y \partial t_x} & \frac{\partial ^2 R}{\partial t_y^2}
\end{array} \right] \end{aligned}$$
(52)

where the main \((2v \times 2v)\) sub-matrix, which is constant and can therefore be precomputed, is

$$\begin{aligned} \frac{\partial ^2 R}{\partial \mathbf{s}^2} = 2 \Sigma _{\mathbf{s}}^{-1}. \end{aligned}$$
(53)

The 2D pose diagonal terms are given by

$$\begin{aligned} \frac{\partial ^2 R}{\partial a^2}= & {} \left( \frac{\partial \mathbf{s}_\text {BM}}{\partial a} \right) ^T \frac{\partial ^2 R}{\partial \mathbf{s}_\text {BM}^2} \frac{\partial \mathbf{s}_\text {BM}}{\partial a} \nonumber \\= & {} 2 \left( \mathbf{s} - \mathbf{s}_m \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \mathbf{s} - \mathbf{s}_m \right) \end{aligned}$$
(54)
$$\begin{aligned} \frac{\partial ^2 R}{\partial b^2}= & {} \left( \frac{\partial \mathbf{s}_\text {BM}}{\partial b} \right) ^T \frac{\partial ^2 R}{\partial \mathbf{s}_\text {BM}^2} \frac{\partial \mathbf{s}_\text {BM}}{\partial b} \nonumber \\= & {} 2 \left( \frac{\mathbf{s}_m^y - \mathbf{s}^y}{\mathbf{s}^x - \mathbf{s}_m^x} \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{s}_m^y - \mathbf{s}^y }{\mathbf{s}^x - \mathbf{s}_m^x} \right) \end{aligned}$$
(55)
$$\begin{aligned} \frac{\partial ^2 R}{\partial t_x^2}= & {} \left( \frac{\partial \mathbf{s}_\text {BM}}{\partial t_x} \right) ^T \frac{\partial ^2 R}{\partial \mathbf{s}_\text {BM}^2} \frac{\partial \mathbf{s}_\text {BM}}{\partial t_x} = 2 \left( \frac{\mathbf{1 }_v}{\mathbf{0 }_v} \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{1 }_v}{\mathbf{0 }_v}\right) \nonumber \\ \end{aligned}$$
(56)
$$\begin{aligned} \frac{\partial ^2 R}{\partial t_y^2}= & {} \left( \frac{\partial \mathbf{s}_\text {BM}}{\partial t_y} \right) ^T \frac{\partial ^2 R}{\partial \mathbf{s}_\text {BM}^2} \frac{\partial \mathbf{s}_\text {BM}}{\partial t_y} = 2 \left( \frac{\mathbf{0 }_v }{\mathbf{1 }_v} \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{0 }_v}{\mathbf{1 }_v}\right) \nonumber \\ \end{aligned}$$
(57)

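where \(\mathbf{0 }_v\) and \(\mathbf{1 }_v\) are v sized vectors filled with zeros and ones, respectively. The fraction-style notation, e.g. \(\left( \frac{\mathbf{1 }_v}{\mathbf{0 }_v} \right) \), denotes the 2v vector obtained by stacking the upper block on top of the lower one, not a division. In the expressions above, \(\mathbf{s}^x\) and \(\mathbf{s}^y\) represent the x and y components (v sized vectors) of the shape \(\mathbf{s}\). Additionally, note that \(\mathbf{s}_m\) (the 2v expanded vector that defines the base mesh centre of mass) is constant.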

The 2D pose mixed terms are given by

$$\begin{aligned} \frac{\partial ^2 R}{\partial a \partial b}= & {} \frac{\partial ^2 R}{\partial b \partial a} = 2 \left( \mathbf{s} - \mathbf{s}_m \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{s}_m^y - \mathbf{s}^y}{\mathbf{s}^x - \mathbf{s}_m^x} \right) \end{aligned}$$
(58)
$$\begin{aligned} \frac{\partial ^2 R}{\partial a \partial t_x}= & {} \frac{\partial ^2 R}{\partial t_x \partial a} = 2 \left( \mathbf{s} - \mathbf{s}_m \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{1 }_v}{\mathbf{0 }_v} \right) \end{aligned}$$
(59)
$$\begin{aligned} \frac{\partial ^2 R}{\partial a \partial t_y}= & {} \frac{\partial ^2 R}{\partial t_y \partial a} = 2 \left( \mathbf{s} - \mathbf{s}_m \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{0 }_v}{\mathbf{1 }_v} \right) \end{aligned}$$
(60)
$$\begin{aligned} \frac{\partial ^2 R}{\partial b \partial t_x}= & {} \frac{\partial ^2 R}{\partial t_x \partial b} = 2 \left( \frac{\mathbf{s}_m^y - \mathbf{s}^y}{\mathbf{s}^x - \mathbf{s}_m^x} \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{1 }_v}{\mathbf{0 }_v} \right) \end{aligned}$$
(61)
$$\begin{aligned} \frac{\partial ^2 R}{\partial b \partial t_y}= & {} \frac{\partial ^2 R}{\partial t_y \partial b} = 2 \left( \frac{\mathbf{s}_m^y - \mathbf{s}^y}{\mathbf{s}^x - \mathbf{s}_m^x } \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{0 }_v}{\mathbf{1 }_v}\right) \end{aligned}$$
(62)
$$\begin{aligned} \frac{\partial ^2 R}{\partial t_x \partial t_y}= & {} \frac{\partial ^2 R}{\partial t_y \partial t_x} = 2 \left( \frac{\mathbf{1 }_v}{\mathbf{0 }_v}\right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{0 }_v}{\mathbf{1 }_v}\right) . \end{aligned}$$
(63)
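
The ten pose terms in Eqs. 54 to 63 share one pattern: each is twice a product of two pose derivatives of \(\mathbf{s}_\text {BM}\) weighted by \(\Sigma _{\mathbf{s}}^{-1}\). The sketch below is a hedged illustration of that pattern rather than the authors' code; it stacks the four derivatives as the columns of a matrix `J` and evaluates the whole \(4 \times 4\) pose block at once. The function names, the \([x_1..x_v,\, y_1..y_v]\) vector layout and the random test data are assumptions.

```python
import numpy as np

def pose_jacobian(s, s_m):
    """Columns are the derivatives of s_BM w.r.t. a, b, t_x, t_y (Eqs. 54-57)."""
    v = s.size // 2
    d = s - s_m
    dx, dy = d[:v], d[v:]
    d_a = d                                            # s - s_m
    d_b = np.concatenate([-dy, dx])                    # (s_m^y - s^y ; s^x - s_m^x)
    d_tx = np.concatenate([np.ones(v), np.zeros(v)])   # (1_v ; 0_v)
    d_ty = np.concatenate([np.zeros(v), np.ones(v)])   # (0_v ; 1_v)
    return np.column_stack([d_a, d_b, d_tx, d_ty])

def pose_hessian(s, s_m, sigma_inv):
    """4 x 4 pose block of H_R, ordered as (a, b, t_x, t_y)."""
    J = pose_jacobian(s, s_m)
    return 2.0 * J.T @ sigma_inv @ J

# Arbitrary example data (assumed values, for illustration only).
rng = np.random.default_rng(0)
v = 5
s = rng.normal(size=2 * v)
s_m = np.concatenate([np.full(v, 0.3), np.full(v, -0.1)])  # constant centre of mass
A = rng.normal(size=(2 * v, 2 * v))
sigma_inv = A @ A.T + 2 * v * np.eye(2 * v)  # symmetric positive definite stand-in
H_pose = pose_hessian(s, s_m, sigma_inv)
```

Entry `H_pose[0, 0]` corresponds to Eq. 54 and `H_pose[2, 3]` to Eq. 63; the symmetry of the block follows from the symmetry of \(\Sigma _{\mathbf{s}}^{-1}\).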

Finally, the remaining mixed terms are

$$\begin{aligned} \frac{\partial ^2 R}{\partial x_i \partial a}= & {} \frac{\partial ^2 R}{\partial a \partial x_i} = 2 \left( \frac{a \varvec{\delta }_i }{b \varvec{\delta }_{i}} \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \mathbf{s} - \mathbf{s}_m \right) \nonumber \\&+ 2 \left( \frac{\varvec{\delta }_i }{\mathbf{0 }_v} \right) ^T \Sigma _{\mathbf{s}}^{-1} ( \mathbf{s}_\text {BM} - \mathbf{s}_0 )\end{aligned}$$
(64)
$$\begin{aligned} \frac{\partial ^2 R}{\partial y_i \partial a}= & {} \frac{\partial ^2 R}{\partial a \partial y_i} = 2 \left( \frac{-b \varvec{\delta }_i }{a \varvec{\delta }_{i}} \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \mathbf{s} - \mathbf{s}_m \right) \nonumber \\&+ 2 \left( \frac{\mathbf{0 }_v}{\varvec{\delta }_i } \right) ^T \Sigma _{\mathbf{s}}^{-1} ( \mathbf{s}_\text {BM} - \mathbf{s}_0 ) \end{aligned}$$
(65)
$$\begin{aligned} \frac{\partial ^2 R}{\partial x_i \partial b}= & {} \frac{\partial ^2 R}{\partial b \partial x_i} = 2 \left( \frac{a \varvec{\delta }_i}{b \varvec{\delta }_{i}} \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{\mathbf{s}_m^y - \mathbf{s}^y}{\mathbf{s}^x - \mathbf{s}_m^x} \right) \nonumber \\&+ 2 \left( \frac{ \mathbf{0 }_v}{\varvec{\delta }_i } \right) ^T \Sigma _{\mathbf{s}}^{-1} ( \mathbf{s}_\text {BM} - \mathbf{s}_0 )\end{aligned}$$
(66)
$$\begin{aligned} \frac{\partial ^2 R}{\partial y_i \partial b}= & {} \frac{\partial ^2 R}{\partial b \partial y_i} = 2 \left( \frac{ -b \varvec{\delta }_i}{ a \varvec{\delta }_{i} } \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{ \mathbf{s}_m^y - \mathbf{s}^y}{ \mathbf{s}^x - \mathbf{s}_m^x } \right) \nonumber \\&+ 2 \left( \frac{ -\varvec{\delta }_i}{ \mathbf{0 }_v } \right) ^T \Sigma _{\mathbf{s}}^{-1} ( \mathbf{s}_\text {BM} - \mathbf{s}_0 )\end{aligned}$$
(67)
$$\begin{aligned} \frac{\partial ^2 R}{\partial x_i \partial t_x}= & {} \frac{\partial ^2 R}{\partial t_x \partial x_i} = 2 \left( \frac{ a \varvec{\delta }_i}{ b \varvec{\delta }_{i} } \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{ \mathbf{1 }_v}{ \mathbf{0 }_v } \right) \end{aligned}$$
(68)
$$\begin{aligned} \frac{\partial ^2 R}{\partial y_i \partial t_x}= & {} \frac{\partial ^2 R}{\partial t_x \partial y_i} = 2 \left( \frac{ -b \varvec{\delta }_i}{ a \varvec{\delta }_{i} } \right) ^T \Sigma _{\mathbf{s}}^{-1} \left( \frac{ \mathbf{1 }_v}{ \mathbf{0 }_v } \right) , \end{aligned}$$
(69)

where \(\varvec{\delta }_i\) is a v-dimensional vector of zeros with a single 1 at the ith element.
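
Because \(\varvec{\delta }_i\) only selects the ith entry of a vector, these mixed terms can be evaluated for all landmarks at once without building any \(\varvec{\delta }_i\) explicitly. The following is a minimal sketch of that idea for Eq. 64 only, assuming the \([x_1..x_v,\, y_1..y_v]\) layout; the function names are hypothetical, and \(\mathbf{s}_\text {BM}\) and \(\mathbf{s}_0\) are simply passed in as given vectors.

```python
import numpy as np

def mixed_xa_loop(a, b, s, s_m, s_bm, s_0, sigma_inv):
    """Eq. 64 evaluated literally, one selector vector delta_i at a time."""
    v = s.size // 2
    out = np.empty(v)
    for i in range(v):
        delta = np.zeros(v)
        delta[i] = 1.0
        first = np.concatenate([a * delta, b * delta]) @ sigma_inv @ (s - s_m)
        second = np.concatenate([delta, np.zeros(v)]) @ sigma_inv @ (s_bm - s_0)
        out[i] = 2.0 * first + 2.0 * second
    return out

def mixed_xa_vectorized(a, b, s, s_m, s_bm, s_0, sigma_inv):
    """Same values for all i, using slicing instead of selector vectors."""
    v = s.size // 2
    w1 = sigma_inv @ (s - s_m)      # shared by the first term of every i
    w2 = sigma_inv @ (s_bm - s_0)   # shared by the second term of every i
    return 2.0 * (a * w1[:v] + b * w1[v:]) + 2.0 * w2[:v]

# Arbitrary example data (assumed values, for illustration only).
rng = np.random.default_rng(1)
v = 6
s, s_m, s_bm, s_0 = (rng.normal(size=2 * v) for _ in range(4))
A = rng.normal(size=(2 * v, 2 * v))
sigma_inv = A @ A.T + np.eye(2 * v)  # symmetric stand-in for the inverse covariance
loop_vals = mixed_xa_loop(0.9, 0.2, s, s_m, s_bm, s_0, sigma_inv)
fast_vals = mixed_xa_vectorized(0.9, 0.2, s, s_m, s_bm, s_0, sigma_inv)
```

The vectorized form needs only two matrix-vector products regardless of the number of landmarks; the other mixed terms could be handled in the same way.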

1.2 Gradients of the 2D \(+\) 3D Model

The gradient of the 3D regularization term, in Eq. 42, is

$$\begin{aligned} \nabla _{\text {R3D}}(\overline{\mathbf{s}}) = \left[ \begin{array}{ccc} \mathbf{0 }_{2v+4}&2 \Sigma _{\overline{\mathbf{s}}}^{-1} (\overline{\mathbf{s}} - \overline{\mathbf{s}}_0)&\mathbf{0 }_6 \end{array} \right] . \end{aligned}$$
(70)

Recall that the gradient of the 3D to 2D projection error is defined by a \((2v) \times (2v+4+3v+6)\) matrix as

$$\begin{aligned} \nabla \mathbf{r } = \left[ \begin{array}{ccccccccc} \frac{\partial \mathbf{r} }{\partial \mathbf{s}}&\frac{\partial \mathbf{r} }{\partial \varvec{ \theta }}&\frac{\partial \mathbf{r} }{\partial \overline{\mathbf{s}}}&\frac{\partial \mathbf{r} }{\partial \sigma }&\frac{\partial \mathbf{r} }{\partial \Delta \theta _x}&\frac{\partial \mathbf{r} }{\partial \Delta \theta _y}&\frac{\partial \mathbf{r} }{\partial \Delta \theta _z}&\frac{\partial \mathbf{r} }{\partial o_x}&\frac{\partial \mathbf{r} }{\partial o_y} \end{array} \right] \end{aligned}$$

with

$$\begin{aligned}&\frac{\partial \mathbf{r} }{\partial \mathbf{s}} = \mathbf{I}_{2v}, \quad \frac{\partial \mathbf{r} }{\partial \varvec{ \theta }} = \mathbf{0 }_{2v}, \quad \frac{\partial \mathbf{r} }{\partial \overline{\mathbf{s}}} = -\mathbf{P} \otimes \mathbf{I}_v, \quad \frac{\partial \mathbf{r} }{\partial \sigma } = - \mathbf{I}_v \otimes \mathbf{R} _o ~ \overline{\mathbf{s}}, \\&\frac{\partial \mathbf{r} }{\partial \Delta \theta _x} = \mathbf{I}_v \otimes \left( \mathbf{P} \left[ \begin{array}{ccc} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} -1 \\ 0 &{} 1 &{} 0 \end{array} \right] \right) \overline{\mathbf{s}}, \frac{\partial \mathbf{r} }{\partial \Delta \theta _y} = \mathbf{I}_v \otimes \left( \mathbf{P} \left[ \begin{array}{ccc} 0 &{} 0 &{} 1\\ 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 \end{array}\right] \right) \overline{\mathbf{s}}, \\&\frac{\partial \mathbf{r} }{\partial \Delta \theta _z} = \mathbf{I}_v \otimes \left( \mathbf{P} \left[ \begin{array}{ccc} 0 &{} -1 &{} 0\\ 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{array} \right] \right) \overline{\mathbf{s}}, \frac{\partial \mathbf{r} }{\partial o_x} = \left( \frac{ - \mathbf{1 }_v}{ \mathbf{0 }_v } \right) , \frac{\partial \mathbf{r} }{\partial o_y} = \left( \frac{ \mathbf{0 }_v}{ - \mathbf{1 }_v } \right) \end{aligned}$$

where \(\mathbf{I}_{n}\) represents an n-dimensional identity matrix and the \(\otimes \) symbol is the Kronecker product.
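
Two ingredients of these derivatives can be checked numerically. The sketch below, offered as an illustration under stated assumptions rather than as the authors' implementation, verifies (i) that the three constant \(3 \times 3\) matrices above are the derivatives of the elementary rotations at zero angle, and (ii) that \(\mathbf{P} \otimes \mathbf{I}_v\) applies a \(2 \times 3\) projection to every landmark when the 3D shape is stored coordinate-major as \([x_1..x_v,\, y_1..y_v,\, z_1..z_v]\). That layout, the example projection and all function names are assumptions.

```python
import numpy as np

# Constant matrices appearing in the derivatives w.r.t. the rotation increments.
G_X = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
G_Y = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
G_Z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def derivative_at_zero(rot, eps=1e-6):
    """Central finite difference of a rotation matrix around angle 0."""
    return (rot(eps) - rot(-eps)) / (2.0 * eps)

def project_kron(P, s3d):
    """Apply (P kron I_v) to a coordinate-major 3v vector."""
    v = s3d.size // 3
    return np.kron(P, np.eye(v)) @ s3d

def project_pointwise(P, s3d):
    """Project each 3D landmark with P, returning [x'_1..x'_v, y'_1..y'_v]."""
    v = s3d.size // 3
    points = s3d.reshape(3, v)          # rows hold the x, y and z coordinates
    return (P @ points).reshape(-1)

# Arbitrary example data (assumed values, for illustration only).
rng = np.random.default_rng(2)
s3d = rng.normal(size=3 * 7)
P = 1.5 * rot_z(0.4)[:2, :]             # scaled orthographic projection stand-in
```

The dense Kronecker matrix is built here only to make the identity explicit; in practice the pointwise form avoids forming a \(2v \times 3v\) matrix.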

Finally, the Hessian of the 3D shape regularization term (in Eq. 43) is a \((2v+4+3v+6)\) square matrix given by

$$\begin{aligned} \mathbf{H}_{\text {R3D}}(\overline{\mathbf{s}}) = \left[ \begin{array}{cccc} \mathbf{0 }_{2v} &{}\quad \mathbf{0 }_{4} &{}\quad \mathbf{0 }_{3v} &{}\quad \mathbf{0 }_6 \\ \mathbf{0 }_{2v} &{}\quad \mathbf{0 }_{4} &{}\quad \mathbf{0 }_{3v} &{}\quad \mathbf{0 }_6\\ \mathbf{0 }_{2v} &{}\quad \mathbf{0 }_{4} &{}\quad 2 \Sigma _{\overline{\mathbf{s}}}^{-1} &{}\quad \mathbf{0 }_6\\ \mathbf{0 }_{2v} &{}\quad \mathbf{0 }_{4} &{}\quad \mathbf{0 }_{3v} &{}\quad \mathbf{0 }_6\\ \end{array} \right] \end{aligned}$$
(71)

Note that this matrix is constant and can therefore be precomputed.
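
The block structure of Eqs. 70 and 71 can be sketched as follows. This is a minimal illustration, assuming the parameter vector is ordered as the 2D shape (\(2v\)), the 2D pose (4), the 3D shape (\(3v\)) and the 3D pose (6), which is the ordering suggested by the block sizes above; the function names are hypothetical.

```python
import numpy as np

def gradient_r3d(s3d, s3d_0, sigma3d_inv, v):
    """Gradient of the 3D regularization term (Eq. 70) as a flat vector."""
    n = 2 * v + 4 + 3 * v + 6
    start = 2 * v + 4                   # offset of the 3D shape block
    grad = np.zeros(n)
    grad[start:start + 3 * v] = 2.0 * sigma3d_inv @ (s3d - s3d_0)
    return grad

def hessian_r3d(sigma3d_inv, v):
    """Hessian of the 3D regularization term (Eq. 71); constant, so build it once."""
    n = 2 * v + 4 + 3 * v + 6
    start = 2 * v + 4
    H = np.zeros((n, n))
    H[start:start + 3 * v, start:start + 3 * v] = 2.0 * sigma3d_inv
    return H

# Arbitrary example data (assumed values, for illustration only).
rng = np.random.default_rng(3)
v = 4
A = rng.normal(size=(3 * v, 3 * v))
sigma3d_inv = A @ A.T + np.eye(3 * v)   # symmetric stand-in for the inverse covariance
s3d, s3d_0 = rng.normal(size=3 * v), rng.normal(size=3 * v)
H_r3d = hessian_r3d(sigma3d_inv, v)
g_r3d = gradient_r3d(s3d, s3d_0, sigma3d_inv, v)
```

Only the \(3v \times 3v\) block is nonzero, so a practical implementation might store just that block and its offset rather than the full matrix.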

Cite this article

Martins, P., Henriques, J.F. & Batista, J. Gradient Shape Model. Int J Comput Vis 128, 2828–2848 (2020). https://doi.org/10.1007/s11263-020-01341-y
