Attention-based convolutional neural network for deep face recognition

Multimedia Tools and Applications

Abstract

Discriminative feature embedding is essential for large-scale face recognition. In this paper, we propose an attention-based convolutional neural network (ACNN) for discriminative face feature embedding, which aims to decrease the information redundancy among channels and focus on the most informative components of spatial feature maps. More specifically, the proposed attention module consists of a channel attention block and a spatial attention block, which adaptively aggregate the feature maps in the channel and spatial domains to learn an inter-channel relationship matrix and an inter-spatial relationship matrix; matrix multiplications with these matrices then yield a refined and robust face feature. With the proposed attention module, standard convolutional neural networks (CNNs) such as ResNet-50 and ResNet-101 gain more discriminative power for deep face recognition. Experiments on Labeled Faces in the Wild (LFW), Age Database (AgeDB), Celebrities in Frontal Profile (CFP) and MegaFace Challenge 1 (MF1) show that the proposed ACNN architecture consistently outperforms naive CNNs and achieves state-of-the-art performance.
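As a rough illustration of the idea in the abstract, the sketch below builds an inter-channel relationship matrix and an inter-spatial relationship matrix from a single feature map and applies each by matrix multiplication. It is a generic self-attention-style formulation and should not be read as the authors' ACNN module, whose details are not given here. The function names, the softmax normalization, the residual additions, the channel-then-spatial ordering, and the absence of learned projection layers are all assumptions made for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def channel_attention(feat):
    """Refine a (C, H, W) feature map using a (C, C) relationship matrix.

    Hypothetical helper: no learned weights, residual add is an assumption.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)                # (C, N), N = H * W
    relation = softmax(flat @ flat.T, axis=-1)   # (C, C), each row sums to 1
    refined = relation @ flat                    # mix channels by similarity
    return refined.reshape(c, h, w) + feat


def spatial_attention(feat):
    """Refine a (C, H, W) feature map using an (N, N) relationship matrix.

    Hypothetical helper: no learned weights, residual add is an assumption.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)                # (C, N)
    relation = softmax(flat.T @ flat, axis=-1)   # (N, N), each row sums to 1
    refined = flat @ relation.T                  # each position aggregates the others
    return refined.reshape(c, h, w) + feat


def attention_module(feat):
    """Apply channel attention, then spatial attention (ordering is an assumption)."""
    return spatial_attention(channel_attention(feat))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((8, 4, 4))  # toy stand-in for a CNN feature map
    refined = attention_module(feature_map)
    print(refined.shape)
```

In a real network, such a module would typically sit after a convolutional stage of a backbone such as ResNet-50, with learned projections producing the inputs to each relationship matrix; those parts are omitted here.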

Acknowledgements

This work was supported in part by the Natural Science Foundation of China under Grants U1536203 and 61972169, in part by the National Key Research and Development Program of China (2016QY01W0200), and in part by the Major Scientific and Technological Project of Hubei Province (2018AAA068 and 2019AAA051).

Author information

Corresponding author

Correspondence to Hefei Ling.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ling, H., Wu, J., Huang, J. et al. Attention-based convolutional neural network for deep face recognition. Multimed Tools Appl 79, 5595–5616 (2020). https://doi.org/10.1007/s11042-019-08422-2

  • DOI: https://doi.org/10.1007/s11042-019-08422-2
