DC-EDN: densely connected encoder-decoder network with reinforced depthwise convolution for face alignment

Applied Intelligence

Abstract

Accurate and fast face alignment algorithms play an important role in many face-related applications. In general, model speed decreases as the number of parameters grows. We build our network from densely connected encoder-decoders, an efficient way to balance the number of parameters against localization accuracy. Within each encoder-decoder, we introduce stacked depthwise convolutions and depthwise feature fusion within the same channel, which greatly improves the performance of depthwise convolution while reducing the number of model parameters. In addition, we enhance the mean squared error loss function by assigning a different penalty weight to each coordinate according to its distance from the position of the maximum value in the label heatmap. Experiments show that the model trained with the improved loss function obtains better localization results. We compare our method with state-of-the-art methods on 300W and WFLW. On the common subset of 300W, the localization error is 2.76%, and the model size (0.7M) is small, using only approximately 1% of the number of parameters of the other models. The dataset and model based on WFLW are publicly available at https://github.com/iam-zhanghongliang/DC-EDN.
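To make the idea of a distance-weighted heatmap loss concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the weighting formula, so the weight `1 + alpha / (1 + d)`, the parameter `alpha`, and the helper names `gaussian_heatmap`, `peak_position`, and `distance_weighted_mse` are all assumptions chosen for illustration. The only property taken from the abstract is that the penalty on each heatmap coordinate depends on its distance to the maximum of the label heatmap.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.0):
    """Build a label heatmap with a Gaussian peak at column cx, row cy."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def peak_position(heatmap):
    """Return (row, col) of the maximum value in the heatmap."""
    best_r, best_c = 0, 0
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > heatmap[best_r][best_c]:
                best_r, best_c = r, c
    return best_r, best_c


def distance_weighted_mse(pred, label, alpha=4.0):
    """Mean squared error with a per-coordinate penalty weight.

    ASSUMPTION: the weight 1 + alpha / (1 + d), where d is the Euclidean
    distance to the label peak, is a hypothetical choice. It only
    illustrates "weights that depend on distance to the peak".
    """
    peak_r, peak_c = peak_position(label)
    total = 0.0
    count = 0
    for r, (pred_row, label_row) in enumerate(zip(pred, label)):
        for c, (p, t) in enumerate(zip(pred_row, label_row)):
            d = math.hypot(r - peak_r, c - peak_c)
            weight = 1.0 + alpha / (1.0 + d)
            total += weight * (p - t) ** 2
            count += 1
    return total / count


# Small usage example: the same error costs more near the peak.
label = gaussian_heatmap(5, 5, cx=2, cy=2)
near_peak = [row[:] for row in label]
near_peak[2][2] -= 0.5          # error at the peak
far_away = [row[:] for row in label]
far_away[0][0] += 0.5           # same-sized error at a corner
print(distance_weighted_mse(near_peak, label) > distance_weighted_mse(far_away, label))
```

With `alpha=0.0` the function reduces to the ordinary mean squared error, so the weighting can be switched off to compare against the unweighted baseline.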

Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. N160504007) and the National Natural Science Foundation of China (Grant No. 31301086).

Author information

Corresponding author

Correspondence to Xiangde Zhang.

About this article

Cite this article

Yang, L., Zhang, H., Wei, P. et al. DC-EDN: densely connected encoder-decoder network with reinforced depthwise convolution for face alignment. Appl Intell 51, 5025–5039 (2021). https://doi.org/10.1007/s10489-020-01940-9

  • DOI: https://doi.org/10.1007/s10489-020-01940-9
