Pose-Invariant Face Alignment via CNN-Based Dense 3D Model Fitting

International Journal of Computer Vision

Abstract

Pose-invariant face alignment is a challenging problem in computer vision and a prerequisite for many facial analysis tasks, e.g., face recognition, expression recognition, and 3D face reconstruction. A few recent methods have tackled this problem, but more research is needed to achieve higher accuracy. In this paper, we propose a method that aligns face images with arbitrary poses by combining cascaded CNN regressors, a 3D Morphable Model (3DMM), and a mirrorability constraint. The core of the method is a novel 3DMM fitting algorithm in which the camera projection matrix parameters and the 3D shape parameters are estimated by a cascade of CNN-based regressors. Furthermore, we impose the mirrorability constraint during CNN learning by employing a novel loss function inside a siamese network. The dense 3D shape enables us to design pose-invariant appearance features for effective CNN learning. Extensive experiments are conducted on challenging large-pose face databases (AFLW and AFW), with comparison to the state of the art.
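The ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the linear shape model, the 2x4 projection matrix, the additive per-stage updates, and every function and parameter name below (`shape_from_params`, `project`, `cascade_fit`, `mirror_landmarks`, `mirrorability_error`, `swap_index`) are assumptions made for illustration. The CNN regressors are stood in for by arbitrary callables.

```python
import numpy as np

def shape_from_params(mean_shape, basis, p):
    """Dense 3D shape as the mean plus a linear combination of basis shapes.

    Assumed shapes: mean_shape (3, N), basis (K, 3, N), p (K,).
    """
    return mean_shape + np.tensordot(p, basis, axes=1)

def project(M, shape_3d):
    """Apply an assumed 2x4 camera projection matrix to (3, N) points."""
    ones = np.ones((1, shape_3d.shape[1]))
    return M @ np.vstack([shape_3d, ones])  # result is (2, N)

def cascade_fit(image, mean_shape, basis, regressors, M0, p0):
    """Run a cascade in which each stage refines (M, p).

    Each entry of `regressors` is a hypothetical callable standing in for a
    trained CNN; it returns additive updates (dM, dp). Updating both
    parameter sets at every stage is an assumption of this sketch.
    """
    M, p = M0.copy(), p0.copy()
    for regress in regressors:
        landmarks = project(M, shape_from_params(mean_shape, basis, p))
        dM, dp = regress(image, landmarks)
        M, p = M + dM, p + dp
    return M, p

def mirror_landmarks(landmarks, width, swap_index):
    """Reflect (2, N) landmarks horizontally and swap left/right indices."""
    flipped = landmarks.copy()
    flipped[0] = (width - 1) - flipped[0]
    return flipped[:, swap_index]

def mirrorability_error(pred, pred_on_flipped, width, swap_index):
    """Mean squared distance between a prediction on an image and the
    un-mirrored prediction made on its horizontally flipped copy."""
    back = mirror_landmarks(pred_on_flipped, width, swap_index)
    return float(np.mean(np.sum((pred - back) ** 2, axis=0)))
```

A mirrorability-style loss would penalize `mirrorability_error` during training so that the two branches of a siamese network, one fed the image and one fed its mirror, agree after un-mirroring. The exact loss used in the paper is not stated in the abstract.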



Author information

Correspondence to Xiaoming Liu.

Additional information

Communicated by Xiaoou Tang.

Cite this article

Jourabloo, A., Liu, X. Pose-Invariant Face Alignment via CNN-Based Dense 3D Model Fitting. Int J Comput Vis 124, 187–203 (2017). https://doi.org/10.1007/s11263-017-1012-z

  • DOI: https://doi.org/10.1007/s11263-017-1012-z
