
Multi-Attention Network for 2D Face Alignment in the Wild

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1043)

Abstract

Most existing face alignment algorithms based on Convolutional Neural Networks (CNNs) overlook the importance of attention mechanisms. In this paper, we propose a Multi-Attention Network (MANet) for robust face alignment. Our attention mechanism comprises multi-level feature attention and multi-scale attention. Multi-level feature attention attends to features at different levels: high-level feature attention captures correlations among neighboring regions, whereas low-level feature attention focuses on detailed descriptions of local parts. Multi-scale attention is designed to obtain a better representation of features at different scales. Together, these attentions improve feature representation and information flow, guiding the network to emphasize key information and suppress less significant information. Experimental results on the 300W and WFLW datasets demonstrate the superiority of the proposed method over state-of-the-art approaches.
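The idea of weighting features from different scales so that informative ones are emphasized and weaker ones suppressed can be illustrated with a small sketch. This is not the MANet architecture, whose details are not given in the abstract; it is a generic softmax attention over scales, assuming each scale has already been reduced to a feature vector of the same length. The function names and the scoring rule (mean activation per scale) are hypothetical choices made only for illustration; a trained network would learn the scoring.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_scale_attention(features_per_scale):
    """Fuse same-length feature vectors from several scales.

    Hypothetical scoring rule: a scale's score is its mean activation.
    This stands in for a learned attention branch.
    """
    scores = [sum(f) / len(f) for f in features_per_scale]
    weights = softmax(scores)  # one attention weight per scale, summing to 1
    dim = len(features_per_scale[0])
    # Weighted sum across scales, computed element by element.
    fused = [
        sum(w * f[i] for w, f in zip(weights, features_per_scale))
        for i in range(dim)
    ]
    return weights, fused


if __name__ == "__main__":
    feats = [
        [0.1, 0.2, 0.3],  # coarse scale (made-up values)
        [1.0, 1.2, 0.8],  # middle scale (made-up values)
        [0.0, 0.1, 0.0],  # fine scale (made-up values)
    ]
    weights, fused = fuse_with_scale_attention(feats)
    print("attention weights:", weights)
    print("fused feature:", fused)
```

Because the weights come from a softmax, the fused vector is a convex combination of the per-scale vectors: the scale with the strongest response dominates, while the others still contribute in proportion to their weights.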

This work was supported in part by the National Natural Science Foundation of China under Grant 61372137 and in part by the Natural Science Foundation of Anhui Province, China, under Grant 1908085MF209.



Author information


Corresponding author

Correspondence to Huabin Wang.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Liu, X., Wang, H., Cheng, R., Yan, X., Tao, L. (2019). Multi-Attention Network for 2D Face Alignment in the Wild. In: Wang, Y., Huang, Q., Peng, Y. (eds) Image and Graphics Technologies and Applications. IGTA 2019. Communications in Computer and Information Science, vol 1043. Springer, Singapore. https://doi.org/10.1007/978-981-13-9917-6_24


  • DOI: https://doi.org/10.1007/978-981-13-9917-6_24


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9916-9

  • Online ISBN: 978-981-13-9917-6

  • eBook Packages: Computer Science, Computer Science (R0)
