Abstract
Facial landmark detection plays an important role in many face understanding tasks, such as face verification, facial expression recognition, and age estimation. Model initialization and feature extraction are crucial in supervised landmark detection, and existing methods commonly suffer from mismatches caused by face detector error and inconsistent initialization. To address this problem, we propose a multi-task feature learning-based improved supervised descent method (MtFL-iSDM) for robust facial landmark localization. First, a fast detection step locates the eyes and mouth, and the initial shape model is adapted to the actual face location according to these detected points. Second, multi-task feature learning is applied to the improved supervised descent method model to achieve better performance. Experiments on four benchmark databases show that our method achieves state-of-the-art performance.
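The abstract does not say how the initial shape is adapted to the detected eyes and mouth. One common way to do this kind of alignment is a least-squares similarity transform (scale, rotation, translation) fitted between anchor points of a mean shape and the detected anchor points, then applied to every landmark of the mean shape. The following is a minimal sketch under that assumption, not the authors' implementation; the function names `fit_similarity` and `adapt_initial_shape` and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fit_similarity(src: List[Point], dst: List[Point]) -> Tuple[complex, complex]:
    """Least-squares similarity transform w = a*z + b mapping src onto dst.

    Points are treated as complex numbers, so `a` encodes scale and rotation
    and `b` encodes translation. This is an assumed alignment model, not one
    stated in the abstract.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matching anchor points")
    z = [complex(x, y) for x, y in src]
    w = [complex(x, y) for x, y in dst]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Closed-form solution of min over (a, b) of sum |a*z_i + b - w_i|^2.
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("source anchors must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def adapt_initial_shape(
    mean_shape: List[Point],
    mean_anchors: List[Point],
    detected_anchors: List[Point],
) -> List[Point]:
    """Move every landmark of the mean shape with the transform fitted on the anchors."""
    a, b = fit_similarity(mean_anchors, detected_anchors)
    adapted = []
    for x, y in mean_shape:
        p = a * complex(x, y) + b
        adapted.append((p.real, p.imag))
    return adapted


# Toy data (hypothetical): left eye, right eye, mouth in a normalized mean shape,
# and the same three points as reported by a fast detector in image coordinates.
MEAN_ANCHORS = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
DETECTED_ANCHORS = [(8.0, 20.0), (12.0, 20.0), (10.0, 22.0)]
MEAN_SHAPE = MEAN_ANCHORS + [(0.0, 0.5)]  # anchors plus one extra landmark

INITIAL_SHAPE = adapt_initial_shape(MEAN_SHAPE, MEAN_ANCHORS, DETECTED_ANCHORS)
```

In this toy case the detected anchors are the mean anchors scaled by 2 and shifted by (10, 20), so the extra landmark at (0, 0.5) lands at approximately (10, 21). A shape adapted in this way would then serve as the starting point for the cascaded regression updates of a supervised descent method.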
Additional information
This work was supported in part by the National Natural Science Foundation of China (Nos. 51505004, 61403024, 61502026) and in part by the Beijing Natural Science Foundation (No. 4163075).
About this article
Cite this article
Bian, P., Xie, Z. & Jin, Y. Multi-task feature learning-based improved supervised descent method for facial landmark detection. SIViP 12, 17–24 (2018). https://doi.org/10.1007/s11760-017-1125-4
DOI: https://doi.org/10.1007/s11760-017-1125-4