Abstract
Rotation-invariant face detection (RIFD) aims to detect faces with arbitrary rotation-in-plane (RIP) on both images and video sequences, which is challenging because of large appearance variations of rotated faces under unconstrained scenarios. The problem becomes more formidable when the speed and memory efficiency of the detectors are taken into account. To solve this problem, we propose a Multi-Task Progressive Calibration Networks (MTPCN), which not only enjoy the natural advantage of explicit geometric structure representation, but also retain important cue to guide the feature learning for precise calibration. More concretely, our framework leverages a cascaded architecture with three stages of carefully designed convolutional neural networks to predict face and landmark location with gradually decreasing RIP ranges. In addition, we propose a novel loss by further integrating geometric information into penalization, which is much more reasonable than simply measuring the differences of training samples equally. MTPCN achieves significant performance improvement on the multi-oriented FDDB dataset and the rotation WIDER face dataset. Extensive experiments are performed, and demonstrate the effectiveness of our method for each of three tasks.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Farfade, S.S., Saberian, M.J., Li, L.J.: Multi-view face detection using deep convolutional neural networks. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, pp. 643–650 (2015)
Rowley, H.A., Baluja, S., Kanade, T.: Rotation invariant neural network-based face detection. In: Proceedings of the 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No. 98CB36231), pp. 38–44. IEEE (1998)
Zhang, K., Zhang, Z., Li, Z., et al.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process. Lett. 23(10), 1499–1503 (2016)
Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2008)
Yang, S., Luo, P., Loy, C.C., et al.: Wider face: a face detection benchmark. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5525–5533 (2016)
Huang, C., Ai, H., Li, Y., et al.: High-performance rotation invariant multiview face detection. IEEE Trans. Pattern Anal. Mach. Intell. 29(4), 671–686 (2007)
Jain, V., Learned-Miller, E.: FDDB: a benchmark for face detection in unconstrained settings. UMass Amherst technical report (2010)
Li, H., Lin, Z., Shen, X., et al.: A convolutional neural network cascade for face detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5325–5334 (2015)
Shi, X., Shan, S., Kan, M., et al.: Real-time rotation-invariant face detection with progressive calibration networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2295–2303 (2018)
Cao, X., Wei, Y., Wen, F., Sun, J.: Face alignment by explicit shape regression. Int. J. Comput. Vision 107(2), 177–190 (2013). https://doi.org/10.1007/s11263-013-0667-3
Zhang, Z., Qiao, S., Xie, C., et al.: Single-shot object detection with enriched semantics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5813–5821 (2018)
Zhu, X., Ramanan, D.: Face detection, pose estimation, and landmark localization in the wild. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2879–2886. IEEE (2012)
Yu, X., Huang, J., Zhang, S., et al.: Pose-free facial landmark fitting via optimized part mixtures and cascaded deformable shape model. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1944–1951 (2013)
Koestinger, M., Wohlhart, P., Roth, P.M., et al.: Annotated facial landmarks in the wild: a large-scale, real-world database for facial landmark localization. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 2144–2151. IEEE (2011)
Wu, Y., Hassner, T., Kim, K.G., et al.: Facial landmark detection with tweaked convolutional neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 3067–3074 (2017)
Huang, G.B., Ramesh, M., Berg, T., Learned-Miller, E., et al.: Labeled faces in the wild: a database for studying face recognition in unconstrained environments. UMass, Amherst, MA, USA, Technical report 07–49, October 2007
Wang, Q., Zhang, L., Bertinetto, L., et al.: Fast online object tracking and segmentation: a unifying approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1328–1338 (2019)
Ranjan, R., Patel, V.M., Chellappa, R.: Hyperface: a deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Trans. Pattern Anal. Mach. Intell. 41(1), 121–135 (2017)
Xiong, X., De la Torre, F.: Supervised descent method and its applications to face alignment. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 532–539 (2013)
Liang, Z., Ding, S., Lin, L.: Unconstrained facial landmark localization with backbone-branches fully-convolutional networks. arXiv preprint arXiv:1507.03409 (2015)
Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
Guo, X., Li, S., Zhang, J., et al.: PFLD: a practical facial landmark detector. arXiv preprint arXiv:1902.10859 (2019)
Burgos-Artizzu, X.P., Perona, P., Dollár, P.: Robust face landmark estimation under occlusion. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1513–1520 (2013)
Yang, B., Yang, C., Liu, Q., et al.: Joint rotation-invariance face detection and alignment with angle-sensitivity cascaded networks. In: Proceedings of the 27th ACM International Conference on Multimedia, pp. 1473–1480 (2019)
Zhang, Z., Luo, P., Loy, C.C., Tang, X.: Facial landmark detection by deep multi-task learning. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8694, pp. 94–108. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10599-4_7
Zhou, X., Zhuo, J., Krahenbuhl, P.: Bottom-up object detection by grouping extreme and center points. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 850–859 (2019)
Dong, X., Yu, S.I., Weng, X., et al.: Supervision-by-registration: An unsupervised approach to improve the precision of facial landmark detectors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 360–368 (2018)
Liu, Z., Zhu, X., Hu, G., et al.: Semantic alignment: Finding semantically consistent ground-truth for facial landmark detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3467–3476 (2019)
Luxand Incorporated: Luxand face SDK. http://www.luxand.com/
Acknowledgment
This work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K201900601), by the National Natural Science Foundation of Chongqing (Grant No. cstc2019jcyj-msxmX0461), and by the Chongqing Overseas Scholars Innovation Program (Grant No. E020H2018013).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, LF., Gu, Y., Wang, P.S.P., Liu, FY., Liu, J., Xu, TY. (2020). Rotation-Invariant Face Detection with Multi-task Progressive Calibration Networks. In: Lu, Y., Vincent, N., Yuen, P.C., Zheng, WS., Cheriet, F., Suen, C.Y. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2020. Lecture Notes in Computer Science(), vol 12068. Springer, Cham. https://doi.org/10.1007/978-3-030-59830-3_44
Download citation
DOI: https://doi.org/10.1007/978-3-030-59830-3_44
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59829-7
Online ISBN: 978-3-030-59830-3
eBook Packages: Computer ScienceComputer Science (R0)