OS-LFFD: a light and fast face detector with Ommateum structure

Multimedia Tools and Applications

Abstract

Face detection is widely deployed on edge devices as the basis for face applications, but such devices cannot store large-scale models and have limited computing power. Existing anchor-based face detection schemes cannot cover faces over a continuous size range, and their performance is unsatisfactory; moreover, higher accuracy generally comes at the cost of larger storage and lower speed. We observe that the feature points in each layer correspond to receptive fields (RFs) within a specific size range. Our survey further indicates that RFs of a given size can predict faces over a continuous range of scales. We therefore argue that RFs are inherent anchors. This paper proposes a Light and Fast Face Detector with an Ommateum Structure (OS-LFFD). By analyzing the correlation between the effective receptive field (ERF) and face sizes, we design a 4-branch network that covers the target range of face sizes. Each branch contains an ommateum block, and the blocks have a similar structure and share parameters. This keeps the number of model parameters low (8 M), making the model much smaller than most face detectors. Experiments on the popular benchmarks WIDER FACE and FDDB across multiple hardware platforms demonstrate that the proposed scheme achieves a good balance between accuracy and running speed.
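To make the idea of RFs as inherent anchors concrete, the following minimal Python sketch computes the theoretical receptive field and cumulative stride of feature points at several depths of a convolutional stack, using the standard recurrence for stacked convolutions. The layer list `backbone` and the tap depths `branch_taps` are hypothetical values chosen for illustration only; they are not the OS-LFFD architecture, which the abstract does not specify. The effective receptive field is generally smaller than the theoretical size computed here, so these numbers should be read as upper bounds.

```python
def receptive_field(layers):
    """Return (rf_size, cumulative_stride) for a stack of (kernel, stride) conv layers.

    Standard recurrence: each layer widens the receptive field by
    (kernel - 1) * current_stride, then multiplies the stride.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf, jump


# Hypothetical backbone: six 3x3 convolutions with assumed strides.
backbone = [(3, 2), (3, 2), (3, 1), (3, 1), (3, 2), (3, 1)]

# Hypothetical depths at which four detection branches tap the backbone.
branch_taps = [3, 4, 5, 6]

for depth in branch_taps:
    rf, jump = receptive_field(backbone[:depth])
    print(f"branch after layer {depth}: theoretical RF = {rf} px, stride = {jump} px")
```

In a sketch like this, each branch sees a different RF size, so a detector could assign each branch the range of face sizes that its RF covers instead of defining hand-crafted anchor boxes.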

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (61976010, 61702022, 61802011), the Beijing Municipal Education Committee Science Foundation (KM201910005024), the China Postdoctoral Science Foundation Funded Project (2018M640033), the Beijing Excellent Young Talent Cultivation Project (2017000020124G075), and the Beijing University of Technology “Ri xin” Cultivation Project.

Author information

Corresponding author

Correspondence to Qing Zhao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, D., Wu, L., He, Y. et al. OS-LFFD: a light and fast face detector with Ommateum structure. Multimed Tools Appl 80, 34153–34172 (2021). https://doi.org/10.1007/s11042-020-09143-7

  • DOI: https://doi.org/10.1007/s11042-020-09143-7
