Abstract
Building efficient face detectors with low computational cost and high speed remains a pressing problem despite the significant progress made in face detection under uncontrolled conditions. To address this issue, we propose two lightweight face detectors for CPU devices: a speed-priority face detector (SPFD) and an accuracy-priority face detector (APFD). For SPFD, we propose a simplified version of FaceBoxes that reduces the number of convolutional layers and channels to lower model complexity and computational cost, and that replaces max pooling layers with group convolutional layers to learn local information and integrate features. Additionally, large kernel attention modules, designed using prior knowledge of face size and network architecture, are applied to strengthen feature representation and capture key long-range information. Finally, an iterative retraining method is designed to further improve accuracy without adding any cost at test time. APFD builds on SPFD with a new anchor generation strategy that finds more faces. Extensive experiments on the WIDER FACE validation set indicate that our detectors surpass FaceBoxes in both accuracy and speed. Specifically, SPFD outperforms FaceBoxes by 4.1%, 8.6%, and 9.6% on the WIDER FACE validation set while achieving the fastest detection speed on CPU devices for VGA-resolution images. Its average speed reaches 45 FPS on CPU devices, and its parameter size is only 0.66 times that of FaceBoxes. Moreover, APFD outperforms many well-known lightweight face detectors and attains superior accuracy (easy: 91.4%, medium: 88.1%, hard: 64.7%) at 20 FPS; it achieves the best trade-off between accuracy and speed for face detection.
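To give a rough sense of why large kernel attention and group convolution suit a lightweight detector, the sketch below counts the learnable parameters of such layers. It assumes the decomposition described in the Visual Attention Network paper (a depthwise convolution, a depthwise dilated convolution, and a 1×1 convolution); the kernel size of 21, dilation of 3, channel width of 64, and group count of 8 are illustrative assumptions, not values taken from SPFD or APFD, and the function names are hypothetical.

```python
import math


def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Learnable parameters of a 2-D convolution with a k x k kernel."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


def lka_params(channels, kernel=21, dilation=3, bias=True):
    """Parameters of a VAN-style decomposed large kernel attention block.

    Assumed decomposition (not confirmed for SPFD/APFD):
      1. depthwise conv with a (2d - 1) x (2d - 1) kernel,
      2. depthwise dilated conv with a ceil(K / d) x ceil(K / d) kernel,
      3. 1 x 1 pointwise conv.
    """
    local_k = 2 * dilation - 1
    dilated_k = math.ceil(kernel / dilation)
    local = conv_params(channels, channels, local_k, groups=channels, bias=bias)
    dilated = conv_params(channels, channels, dilated_k, groups=channels, bias=bias)
    pointwise = conv_params(channels, channels, 1, bias=bias)
    return local + dilated + pointwise


if __name__ == "__main__":
    c = 64  # illustrative channel width
    dense = conv_params(c, c, 21, bias=False)
    lka = lka_params(c, kernel=21, dilation=3, bias=False)
    grouped = conv_params(c, c, 3, groups=8, bias=False)
    print(f"dense 21x21 conv: {dense} parameters")
    print(f"decomposed LKA:   {lka} parameters")
    print(f"3x3 group conv:   {grouped} parameters (8 groups)")
```

Under these assumptions, with 64 channels and no bias, the decomposed block needs 8,832 parameters against 1,806,336 for a dense 21×21 convolution, which illustrates how a large receptive field can be obtained at small cost.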
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant No. 61872448) and the Natural Science Basic Research Plan in Shanxi Province of China (No. 2021JQ-379).
Contributions
SQ and XS proposed the network structure. SQ and ZL wrote the program code and carried out the experimental work. All authors wrote the main manuscript text and reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Qi, S., Song, X., Li, Z. et al. Fast and efficient face detector based on large kernel attention for CPU device. J Real-Time Image Proc 20, 72 (2023). https://doi.org/10.1007/s11554-023-01326-3
DOI: https://doi.org/10.1007/s11554-023-01326-3