Rotation-Invariant Face Detection with Multi-task Progressive Calibration Networks

Zhou, Li-Fang; Gu, Yu; Wang, Patrick S. P.; Liu, Fa-Yuan; Liu, Jie; Xu, Tian-Yu

doi:10.1007/978-3-030-59830-3_44

Conference paper
First Online: 09 October 2020

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12068)

Abstract

Rotation-invariant face detection (RIFD) aims to detect faces with arbitrary rotation-in-plane (RIP) angles in both images and video sequences. The task is challenging because rotated faces vary widely in appearance under unconstrained conditions, and it becomes harder still when the speed and memory efficiency of the detector are taken into account. To address this problem, we propose Multi-Task Progressive Calibration Networks (MTPCN), which not only benefit from an explicit representation of geometric structure but also retain important cues to guide feature learning for precise calibration. More concretely, our framework uses a cascade of three carefully designed convolutional neural networks to predict face and landmark locations over gradually decreasing RIP ranges. In addition, we propose a novel loss that integrates geometric information into the penalty, which is more reasonable than weighting the errors of all training samples equally. MTPCN achieves significant performance improvements on the multi-oriented FDDB dataset and the rotation WIDER face dataset. Extensive experiments demonstrate the effectiveness of our method on each of the three tasks.
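The idea of a three-stage cascade with gradually decreasing RIP ranges can be illustrated with a small sketch. The abstract does not state the per-stage ranges, so the schedule below (a 180-degree flip, then a 90-degree turn, then a fine correction) is an assumption borrowed from the general progressive-calibration idea, not the authors' confirmed design. The functions `wrap`, `stage1`, `stage2`, `stage3`, and `calibrate` are hypothetical names, and each stage is an oracle stand-in that reads the true residual angle, whereas the real stages are convolutional networks that predict from image content and also output face and landmark locations.

```python
from typing import Callable, List

# Hypothetical stage signature: maps the current residual in-plane angle
# (degrees) to the correction the stage applies. A real stage would be a CNN
# operating on an image patch; these are oracle stand-ins for illustration.
Stage = Callable[[float], float]


def wrap(angle: float) -> float:
    """Wrap an angle in degrees to the interval (-180, 180]."""
    a = (angle + 180.0) % 360.0 - 180.0
    return 180.0 if a == -180.0 else a


def stage1(angle: float) -> float:
    # Assumed coarse step: flip upside-down faces, leaving a residual in [-90, 90].
    return 180.0 if abs(angle) > 90.0 else 0.0


def stage2(angle: float) -> float:
    # Assumed middle step: quarter turn, leaving a residual in [-45, 45].
    if angle > 45.0:
        return 90.0
    if angle < -45.0:
        return -90.0
    return 0.0


def stage3(angle: float) -> float:
    # Assumed fine step: regress the remaining angle (here, returned exactly).
    return angle


def calibrate(angle: float, stages: List[Stage]) -> List[float]:
    """Apply the stages in order and return the residual angle after each one."""
    residual = wrap(angle)
    trace = [residual]
    for stage in stages:
        residual = wrap(residual - stage(residual))
        trace.append(residual)
    return trace


if __name__ == "__main__":
    for start in (150.0, -100.0, 30.0):
        print(start, calibrate(start, [stage1, stage2, stage3]))
```

Each early stage only has to make a coarse, few-way decision, which is what allows the early networks in such a cascade to stay small; the fine regression is left to the last stage, which sees a much narrower range of angles.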

Acknowledgment

This work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K201900601), by the National Natural Science Foundation of Chongqing (Grant No. cstc2019jcyj-msxmX0461), and by the Chongqing Overseas Scholars Innovation Program (Grant No. E020H2018013).

Author information

Authors and Affiliations

College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
Li-Fang Zhou & Yu Gu
College of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
Li-Fang Zhou, Jie Liu & Tian-Yu Xu
Northeastern University, Boston, MA, 02115, USA
Patrick S. P. Wang
Petrochina Chongqing Marketing Company, Chongqing, 400065, China
Fa-Yuan Liu

Corresponding author

Correspondence to Li-Fang Zhou.

Editor information

Editors and Affiliations

East China Normal University, Shanghai, China
Yue Lu
Paris Descartes University, Paris, France
Nicole Vincent
Hong Kong Baptist University, Kowloon, Hong Kong
Pong Chi Yuen
Sun Yat-sen University, Guangzhou, China
Wei-Shi Zheng
Polytechnique Montréal, Montreal, QC, Canada
Farida Cheriet
Concordia University, Montreal, QC, Canada
Ching Y. Suen

Cite this paper

Zhou, LF., Gu, Y., Wang, P.S.P., Liu, FY., Liu, J., Xu, TY. (2020). Rotation-Invariant Face Detection with Multi-task Progressive Calibration Networks. In: Lu, Y., Vincent, N., Yuen, P.C., Zheng, WS., Cheriet, F., Suen, C.Y. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2020. Lecture Notes in Computer Science, vol 12068. Springer, Cham. https://doi.org/10.1007/978-3-030-59830-3_44

DOI: https://doi.org/10.1007/978-3-030-59830-3_44
Published: 09 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59829-7
Online ISBN: 978-3-030-59830-3
eBook Packages: Computer Science, Computer Science (R0)
