Deep-learning-based face detection using iterative bounding-box regression

Multimedia Tools and Applications

Abstract

Multi-view face detection in open environments is challenging because of occlusion and the wide variation in face appearance. Localization accuracy is one of the key factors in face detection, yet many existing methods pay too little attention to it. Some current methods do apply localization techniques, but they do not exploit the full potential of those techniques to achieve more accurate localization. In this paper, we propose a deep cascaded detection method that iteratively applies bounding-box regression, a localization technique, to converge on potential faces in images. We also consider the inherent correlation between classification and bounding-box regression and exploit it to further improve overall performance. Specifically, our method uses a cascaded architecture with three stages of carefully designed deep convolutional networks to predict the existence of faces. Extensive experiments demonstrate the efficiency of our algorithm by comparing it with several popular face-detection algorithms on the widely used AFW and FDDB datasets.
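The idea of applying bounding-box regression iteratively can be illustrated with a minimal sketch. It is not the authors' implementation: the preview does not describe their networks, so the code assumes the common center-offset and log-scale box parameterization, and `toy_regressor` is a hypothetical stand-in for a trained regression network that simply moves a box part of the way toward a known target.

```python
import math
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]     # (x, y, w, h): top-left corner, width, height
Deltas = Tuple[float, float, float, float]  # (dx, dy, dw, dh): predicted offsets

def apply_deltas(box: Box, deltas: Deltas) -> Box:
    """Shift the box center by (dx*w, dy*h) and scale its size by exp(dw), exp(dh)."""
    x, y, w, h = box
    dx, dy, dw, dh = deltas
    cx, cy = x + 0.5 * w + dx * w, y + 0.5 * h + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (cx - 0.5 * new_w, cy - 0.5 * new_h, new_w, new_h)

def refine(box: Box, regressor: Callable[[Box], Deltas], iterations: int = 3) -> Box:
    """Apply the regressor repeatedly, feeding each refined box back in."""
    for _ in range(iterations):
        box = apply_deltas(box, regressor(box))
    return box

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def toy_regressor(target: Box, step: float = 0.5) -> Callable[[Box], Deltas]:
    """Hypothetical stand-in for a trained network: corrects a fraction `step` of the error."""
    tx, ty, tw, th = target
    def predict(box: Box) -> Deltas:
        x, y, w, h = box
        dx = step * ((tx + 0.5 * tw) - (x + 0.5 * w)) / w
        dy = step * ((ty + 0.5 * th) - (y + 0.5 * h)) / h
        return (dx, dy, step * math.log(tw / w), step * math.log(th / h))
    return predict

face = (20.0, 20.0, 100.0, 100.0)     # assumed ground-truth face box
proposal = (0.0, 0.0, 50.0, 50.0)     # assumed coarse initial proposal
regressor = toy_regressor(face)
for n in (0, 1, 3):
    print(n, round(iou(refine(proposal, regressor, n), face), 3))
```

With this toy regressor, each pass removes part of the remaining localization error, so the overlap with the target box grows as the number of iterations increases. In a cascade, each stage would supply its own learned regressor rather than sharing one function.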

Author information

Corresponding author

Correspondence to Dazhi Luo.

About this article

Cite this article

Luo, D., Wen, G., Li, D. et al. Deep-learning-based face detection using iterative bounding-box regression. Multimed Tools Appl 77, 24663–24680 (2018). https://doi.org/10.1007/s11042-018-5658-5

  • DOI: https://doi.org/10.1007/s11042-018-5658-5