Abstract
Face recognition for surveillance remains challenging because of the disparity between the low-resolution (LR) face images captured by surveillance cameras and the typically high-resolution (HR) face images stored in databases. To address this cross-resolution face recognition problem, we propose a two-stage dual-resolution face network that learns more robust resolution-invariant representations. In the first stage, we pre-train the proposed dual-resolution face network using only HR images. The network uses a two-branch structure and introduces bilateral connections to fuse the high- and low-resolution features extracted by the two branches. In the second stage, we adopt the triplet loss as the fine-tuning loss function and design a training strategy that combines the triplet loss with competence-based curriculum learning. Guided by the competence function, the pre-trained model is first trained on easy sample sets and gradually progresses to more challenging ones. Our method achieves a face verification accuracy of 99.25% on the native cross-quality dataset SCFace and 99.71% on the high-quality dataset LFW. It also improves face verification accuracy on the native low-quality dataset.
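The second-stage idea can be illustrated with a minimal sketch. The abstract does not give the exact competence function, margin, or difficulty measure, so everything below is an assumption: the square-root competence schedule follows the commonly used form from competence-based curriculum learning for neural machine translation, the triplet loss uses squared Euclidean distance with a hypothetical margin of 0.2, and `available_samples` assumes the samples have already been sorted from easy to hard by some difficulty score.

```python
import math


def sqrt_competence(step, total_steps, c0=0.1):
    """Square-root competence schedule (assumed form, not taken from the paper).

    Returns the fraction of the difficulty-sorted training data that may be
    sampled at `step`; it starts at c0 and reaches 1.0 at `total_steps`.
    """
    progress = step * (1.0 - c0 ** 2) / total_steps + c0 ** 2
    return min(1.0, math.sqrt(progress))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss on embedding vectors using squared Euclidean distance.

    The margin value is a hypothetical default chosen for illustration.
    """
    d_ap = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_an = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_ap - d_an + margin)


def available_samples(sorted_samples, step, total_steps, c0=0.1):
    """Return the easiest prefix of `sorted_samples` allowed at `step`.

    `sorted_samples` is assumed to be ordered from easy to hard.
    """
    c = sqrt_competence(step, total_steps, c0)
    count = max(1, math.ceil(c * len(sorted_samples)))
    return sorted_samples[:count]
```

In a training loop under these assumptions, each step would draw triplets only from `available_samples(...)`, so early updates see easy triplets and the full set becomes available once the competence reaches 1.0.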
Data Availability Statement
The LFW, XQLFW, SCFace, TinyFace, and QMUL-SurvFace datasets that support the findings of this study are available at http://vis-www.cs.umass.edu/lfw/, https://martlgap.github.io/xqlfw/, https://www.scface.org/, https://qmul-tinyface.github.io/, and https://qmul-survface.github.io/, respectively.
Acknowledgements
This study was supported by the National Natural Science Foundation of China under Grant 62071125 and the Natural Science Foundation of Fujian Province under Grants 2021J01581 and 2018J01805.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
About this article
Cite this article
Chen, L., Chen, J., Xu, Z. et al. Two-stage dual-resolution face network for cross-resolution face recognition in surveillance systems. Vis Comput 40, 5545–5556 (2024). https://doi.org/10.1007/s00371-023-03121-4
DOI: https://doi.org/10.1007/s00371-023-03121-4