Abstract
Multi-view face detection in open environments is challenging because face appearance varies widely and faces are often occluded. Localization accuracy is a key factor in face detection, yet many existing methods pay too little attention to it. Some current methods do apply localization techniques, but they have not fully exploited the potential of those techniques to localize faces more accurately. In this paper, we propose a deep cascaded detection method that iteratively applies bounding-box regression, a localization technique, to converge on potential faces in images. We also consider the inherent correlation between classification and bounding-box regression and exploit it to further improve overall performance. Specifically, our method uses a cascaded architecture with three stages of carefully designed deep convolutional networks to predict the existence of faces. Extensive experiments on the widely used AFW and FDDB datasets demonstrate the efficiency of our algorithm in comparison with several popular face-detection algorithms.
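To make the idea of iterative bounding-box regression concrete, the following is a minimal sketch and not the authors' implementation. It assumes the R-CNN-style offset parametrization (dx, dy, dw, dh), which the abstract does not specify, and it replaces each stage's convolutional network with a hypothetical `toy_regressor` that predicts only a fraction of the correction toward a known target box. All function names and the example coordinates are illustrative assumptions.

```python
import math


def apply_offsets(box, offsets):
    """Apply (dx, dy, dw, dh) offsets to a box given as (x1, y1, x2, y2).

    Assumption: R-CNN-style parametrization, where dx and dy shift the
    center in units of box width/height and dw, dh scale the size in log space.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = offsets
    cx, cy = cx + dx * w, cy + dy * h
    w, h = w * math.exp(dw), h * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def toy_regressor(box, target, strength=0.5):
    """Hypothetical stand-in for one stage's network.

    It returns `strength` times the exact offsets toward `target`,
    mimicking a regressor whose single prediction is imperfect.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    tx1, ty1, tx2, ty2 = target
    tw, th = tx2 - tx1, ty2 - ty1
    tcx, tcy = tx1 + 0.5 * tw, ty1 + 0.5 * th
    return (
        strength * (tcx - cx) / w,
        strength * (tcy - cy) / h,
        strength * math.log(tw / w),
        strength * math.log(th / h),
    )


def refine(box, target, stages=3):
    """Refine a candidate box once per stage; return every intermediate box."""
    history = [box]
    for _ in range(stages):
        box = apply_offsets(box, toy_regressor(box, target))
        history.append(box)
    return history


if __name__ == "__main__":
    candidate = (10.0, 10.0, 50.0, 50.0)      # illustrative initial window
    ground_truth = (20.0, 20.0, 80.0, 80.0)   # illustrative face box
    for step, b in enumerate(refine(candidate, ground_truth)):
        print(step, round(iou(b, ground_truth), 3))
```

In this toy setting each pass moves the candidate closer to the target, so the overlap grows from stage to stage. In a real cascade the offsets would come from trained networks, and candidates would also be filtered by classification score between stages.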
Cite this article
Luo, D., Wen, G., Li, D. et al. Deep-learning-based face detection using iterative bounding-box regression. Multimed Tools Appl 77, 24663–24680 (2018). https://doi.org/10.1007/s11042-018-5658-5
DOI: https://doi.org/10.1007/s11042-018-5658-5