Abstract
Discriminative feature embedding is essential to large-scale face recognition. In this paper, we propose an attention-based convolutional neural network (ACNN) for discriminative face feature embedding, which aims to reduce information redundancy among channels and to focus on the most informative components of the spatial feature maps. Specifically, the proposed attention module consists of a channel attention block and a spatial attention block, which adaptively aggregate the feature maps in the channel and spatial domains to learn an inter-channel relationship matrix and an inter-spatial relationship matrix. Matrix multiplications with these matrices then yield a refined and robust face feature. With the proposed attention module, standard convolutional neural networks (CNNs) such as ResNet-50 and ResNet-101 gain discriminative power for deep face recognition. Experiments on Labeled Faces in the Wild (LFW), Age Database (AgeDB), Celebrities in Frontal Profile (CFP), and MegaFace Challenge 1 (MF1) show that the proposed ACNN architecture consistently outperforms naive CNNs and achieves state-of-the-art performance.
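As a rough illustration of the channel and spatial relationship matrices described in the abstract, the following NumPy sketch builds each matrix from a single feature map and applies it by matrix multiplication. It is a minimal sketch under stated assumptions, not the ACNN implementation: the softmax normalisation, the residual addition, the channel-then-spatial ordering, the absence of learned projections, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def channel_attention(feat):
    """Refine a (C, H, W) feature map with a C x C relationship matrix.

    Assumption: the relationship matrix is a row-wise softmax over
    channel-to-channel dot products, and the result is added back
    to the input as a residual.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)                # (C, N) with N = H * W
    relation = softmax(flat @ flat.T, axis=-1)   # (C, C) inter-channel matrix
    refined = relation @ flat                    # re-weight channels
    return refined.reshape(c, h, w) + feat


def spatial_attention(feat):
    """Refine a (C, H, W) feature map with an N x N relationship matrix.

    Assumption: same construction as above, but over spatial positions.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)                # (C, N)
    relation = softmax(flat.T @ flat, axis=-1)   # (N, N) inter-spatial matrix
    refined = flat @ relation.T                  # each position aggregates the others
    return refined.reshape(c, h, w) + feat


def attention_module(feat):
    """Hypothetical composition: channel block followed by spatial block."""
    return spatial_attention(channel_attention(feat))


# Toy feature map standing in for one CNN stage output (assumed sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 7, 7))
y = attention_module(x)
```

The refined map keeps the shape of the input, so a module of this kind could in principle be placed between stages of a backbone such as ResNet-50 without changing the surrounding layers.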
Acknowledgements
This work was supported in part by the Natural Science Foundation of China under Grants U1536203 and 61972169, in part by the National Key Research and Development Program of China (2016QY01W0200), and in part by the Major Scientific and Technological Project of Hubei Province (2018AAA068 and 2019AAA051).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ling, H., Wu, J., Huang, J. et al. Attention-based convolutional neural network for deep face recognition. Multimed Tools Appl 79, 5595–5616 (2020). https://doi.org/10.1007/s11042-019-08422-2
DOI: https://doi.org/10.1007/s11042-019-08422-2