Rotation-Invariant Face Detection with Multi-task Progressive Calibration Networks

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12068)

Abstract

Rotation-invariant face detection (RIFD) aims to detect faces with arbitrary rotation-in-plane (RIP) angles in both images and video sequences. The task is challenging because rotated faces vary widely in appearance under unconstrained conditions, and it becomes harder still when the speed and memory efficiency of the detector are taken into account. To address this problem, we propose Multi-Task Progressive Calibration Networks (MTPCN), which not only benefit from an explicit representation of geometric structure but also retain important cues that guide feature learning toward precise calibration. More concretely, our framework uses a cascade of three carefully designed convolutional neural network stages to predict face and landmark locations over gradually decreasing RIP ranges. In addition, we propose a novel loss that integrates geometric information into the penalty, which is more reasonable than penalizing the errors of all training samples equally. MTPCN achieves significant performance improvements on the multi-oriented FDDB dataset and the rotation WIDER FACE dataset. Extensive experiments demonstrate the effectiveness of our method on each of the three tasks.
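
The idea of a three-stage cascade over "gradually decreasing RIP ranges" can be illustrated with a minimal sketch. The abstract does not specify the per-stage angle ranges or decision rules, so everything below is an assumption made for illustration: the 90° and 45° thresholds, the function names `wrap` and `progressive_calibrate`, and the use of the true angle in place of a learned network's prediction. It is not the authors' implementation.

```python
def wrap(deg):
    """Wrap an angle in degrees into the half-open range [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def progressive_calibrate(true_rip):
    """Simulate a three-stage coarse-to-fine RIP calibration.

    Hypothetical illustration: each stage is an oracle that reads the true
    residual angle. In a real detector each decision would come from a CNN.
    Returns the rotation applied at each stage and their wrapped sum.
    """
    residual = wrap(true_rip)
    applied = []

    # Stage 1 (assumed): up/down decision, residual narrows to [-90, 90].
    s1 = 180.0 if abs(residual) > 90.0 else 0.0
    residual = wrap(residual - s1)
    applied.append(s1)

    # Stage 2 (assumed): left/upright/right decision, residual narrows to [-45, 45].
    if residual > 45.0:
        s2 = 90.0
    elif residual < -45.0:
        s2 = -90.0
    else:
        s2 = 0.0
    residual -= s2
    applied.append(s2)

    # Stage 3 (assumed): fine regression of the remaining small angle.
    s3 = residual
    applied.append(s3)

    return applied, wrap(sum(applied))


if __name__ == "__main__":
    for angle in (170.0, -120.0, 30.0):
        print(angle, progressive_calibrate(angle))
```

Each stage only has to resolve a narrower range than the one before it, which is why a cascade of this kind can use small, fast networks in the early stages.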

Acknowledgment

This work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K201900601), by the National Natural Science Foundation of Chongqing (Grant No. cstc2019jcyj-msxmX0461), and by the Chongqing Overseas Scholars Innovation Program (Grant No. E020H2018013).

Author information

Corresponding author

Correspondence to Li-Fang Zhou.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhou, LF., Gu, Y., Wang, P.S.P., Liu, FY., Liu, J., Xu, TY. (2020). Rotation-Invariant Face Detection with Multi-task Progressive Calibration Networks. In: Lu, Y., Vincent, N., Yuen, P.C., Zheng, WS., Cheriet, F., Suen, C.Y. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2020. Lecture Notes in Computer Science, vol 12068. Springer, Cham. https://doi.org/10.1007/978-3-030-59830-3_44

  • DOI: https://doi.org/10.1007/978-3-030-59830-3_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59829-7

  • Online ISBN: 978-3-030-59830-3

  • eBook Packages: Computer Science, Computer Science (R0)
