Abstract
Face detection is deployed on edge devices as the basis for face applications, but such devices cannot store large models and have limited computing power. Existing anchor-based face detection schemes cannot cover faces over a continuous size range, and their performance is unsatisfactory; higher accuracy generally comes at the cost of more storage and lower speed. We find that the feature points in each layer correspond to a specific size range of receptive fields (RFs). According to our investigation, RFs of a given size can predict faces over a continuous range of scales. We therefore argue that RFs are inherent anchors. This paper proposes a Light and Fast Face Detector with an Ommateum Structure (OS-LFFD). By analyzing the correlation between the effective receptive field (ERF) and face sizes, we design a 4-branch network that covers the target range of face sizes. Each branch contains an ommateum block; the blocks have a similar structure and share parameters. This reduces the number of model parameters (8 M), making the model much smaller than most face detectors. Experiments on the popular benchmarks WIDER FACE and FDDB, run on multiple hardware platforms, demonstrate that the proposed scheme achieves a good balance between accuracy and running speed.
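The claim that feature points in different layers correspond to different RF sizes can be illustrated with the standard receptive-field recurrence for stacked convolutions. The sketch below is only an illustration: the function name `receptive_fields` and the six-layer configuration are hypothetical and are not the OS-LFFD architecture. It computes the theoretical RF; the effective receptive field discussed in the abstract is generally smaller than this value.

```python
from typing import List, Tuple


def receptive_fields(layers: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (rf_size, cumulative_stride) after each convolution layer.

    Each layer is given as (kernel_size, stride). The recurrence is
        rf_out   = rf_in + (kernel_size - 1) * jump_in
        jump_out = jump_in * stride
    where jump is the distance, in input pixels, between adjacent
    feature points of the current layer.
    """
    rf, jump = 1, 1  # a single input pixel sees itself
    result = []
    for kernel_size, stride in layers:
        rf = rf + (kernel_size - 1) * jump
        jump = jump * stride
        result.append((rf, jump))
    return result


# Hypothetical backbone (an assumption for illustration only):
# six 3x3 convolutions, three of them with stride 2.
example_layers = [(3, 2), (3, 2), (3, 1), (3, 1), (3, 2), (3, 1)]

if __name__ == "__main__":
    for index, (rf, stride) in enumerate(receptive_fields(example_layers), 1):
        print(f"layer {index}: RF = {rf} px, stride = {stride} px")
```

In this toy configuration the RF grows from 3 px at the first layer to 47 px at the sixth, so detection branches attached at different depths would naturally see different ranges of face sizes.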
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61976010, 61702022, 61802011), the Beijing Municipal Education Committee Science Foundation (KM201910005024), the China Postdoctoral Science Foundation Funded Project (2018M640033), the Beijing Excellent Young Talent Cultivation Project (2017000020124G075), and the Beijing University of Technology “Ri xin” Cultivation Project.
Cite this article
Xu, D., Wu, L., He, Y. et al. OS-LFFD: a light and fast face detector with Ommateum structure. Multimed Tools Appl 80, 34153–34172 (2021). https://doi.org/10.1007/s11042-020-09143-7
DOI: https://doi.org/10.1007/s11042-020-09143-7