Abstract
Face recognition research has advanced rapidly since the advent of deep learning, particularly 2D Convolutional Neural Networks (2D CNNs). To address real-world challenges in face recognition, researchers have started using 3D CNNs with video as input, which adds a temporal dimension to the spatial dimensions of the input. However, the features extracted by 3D CNNs are not as discriminative as those of 2D CNNs. In this paper, we propose the \(HL_{3}CNN\) framework, which combines the Hardmining loss with a 3D CNN to increase the discriminative ability of the 3D CNN features. The Hardmining loss samples hard examples in the training dataset to improve the performance of the 3D CNN model. In our experiments, the \(HL_{3}CNN\) framework reaches an accuracy of 99.23% on the in-house collected CVBL dataset, which confirms that the Hardmining loss enhances the discriminative ability of a basic loss function for 3D CNN features.
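To make the idea of sampling hard examples concrete, the following is a minimal sketch of generic hard-example selection on top of a softmax cross-entropy loss. It is not the Hardmining loss formulation used in the paper: the function names and the `keep_ratio` hyperparameter are hypothetical, and the class scores would in practice come from the final layer of a 3D CNN rather than from hand-written lists.

```python
import math


def softmax_cross_entropy(logits, label):
    """Per-sample cross-entropy from raw class scores (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def hard_mined_loss(batch_logits, labels, keep_ratio=0.5):
    """Average the loss over only the hardest (highest-loss) samples.

    keep_ratio is a hypothetical hyperparameter chosen for illustration;
    it is not taken from the paper.
    """
    losses = [softmax_cross_entropy(z, y) for z, y in zip(batch_logits, labels)]
    k = max(1, math.ceil(keep_ratio * len(losses)))
    # Sort descending so the samples the model handles worst come first.
    hardest = sorted(losses, reverse=True)[:k]
    return sum(hardest) / k, losses


# Toy batch: four samples, three classes, true class 0 for every sample.
batch_logits = [[5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0], [2.0, 2.0, 0.0]]
labels = [0, 0, 0, 0]
mined, per_sample = hard_mined_loss(batch_logits, labels, keep_ratio=0.5)
plain = sum(per_sample) / len(per_sample)
```

In this toy batch the mined loss is larger than the plain mean because the easy, confidently classified samples are excluded, so the gradient signal concentrates on the samples the model currently gets wrong or is unsure about.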
Acknowledgements
I would like to thank the Indian Institute of Information Technology, Allahabad, India, for supporting the work on the \(HL_{3}CNN\) framework for face recognition.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent to participate
Informed consent was obtained from legal guardians.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Recent Trends in Computer Vision” guest edited by P. Nagabhushan, Balasubramaniyan Raman, Satish Kumar Singh and Subrahmanyam Murala.
About this article
Cite this article
Mishra, N.K., Singh, S.K. Face Recognition Using 3D CNN and Hardmining Loss Function. SN COMPUT. SCI. 3, 155 (2022). https://doi.org/10.1007/s42979-021-01009-5
DOI: https://doi.org/10.1007/s42979-021-01009-5