Light-Weight Distilled HRNet for Facial Landmark Detection

  • Conference paper
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Abstract

This paper proposes a light-weight facial landmark detection model, named “LDHRNet”, which can be trained end-to-end and performs precise facial landmark detection under varied conditions, including large pose, exaggerated expression, non-uniform lighting and occlusion. First, to handle these challenging cases, a light-weight HRNet (LHRNet) structure is proposed as the backbone, in which bottleneck blocks replace the standard residual blocks of the original HRNet and group convolutions replace its standard convolutions. Second, to avoid the accuracy loss caused by coordinate quantization, we use a function named dual soft argmax (DSA) to map the heatmap response to the final coordinates. Third, we propose a Similarity-FeatureMap knowledge distillation model, which guides the training of a student network such that input pairs that produce similar (dissimilar) feature maps in the pre-trained teacher network also produce similar (dissimilar) feature maps in the student network. Finally, we combine the distillation loss and the NME loss to train our model. The best result, an AUC of 79.10%, is achieved on the validation set.
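The two mechanisms the abstract describes in most detail, mapping a heatmap to continuous coordinates and matching pairwise feature-map similarities between teacher and student, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not define DSA, so `soft_argmax_2d` below is a plain single soft-argmax used as a stand-in, and the distillation term follows the generic similarity-preserving formulation (row-normalised batch similarity matrices compared by squared Frobenius distance). All function names, the `beta` temperature and the tensor shapes are assumptions made for illustration.

```python
import numpy as np


def soft_argmax_2d(heatmap, beta=1.0):
    """Expected (x, y) position under a softmax over one heatmap.

    Plain soft-argmax, NOT the paper's dual soft argmax (DSA);
    `beta` is an assumed temperature parameter.
    """
    h, w = heatmap.shape
    logits = beta * heatmap.reshape(-1)
    logits = logits - logits.max()            # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum()
    prob = prob.reshape(h, w)
    xs = np.arange(w, dtype=np.float64)
    ys = np.arange(h, dtype=np.float64)
    x = float((prob.sum(axis=0) * xs).sum())  # marginal over rows
    y = float((prob.sum(axis=1) * ys).sum())  # marginal over columns
    return x, y


def similarity_matrix(features):
    """Row-normalised pairwise similarity of a batch of feature maps.

    `features` has shape (batch, channels, height, width).
    """
    b = features.shape[0]
    q = features.reshape(b, -1)               # flatten each sample
    g = q @ q.T                                # (batch, batch) similarities
    norm = np.linalg.norm(g, axis=1, keepdims=True)
    return g / np.maximum(norm, 1e-12)


def similarity_distillation_loss(teacher_feats, student_feats):
    """Mean squared difference between teacher and student similarity matrices."""
    b = teacher_feats.shape[0]
    diff = similarity_matrix(teacher_feats) - similarity_matrix(student_feats)
    return float((diff ** 2).sum() / (b * b))
```

Because the similarity matrices are batch-by-batch, the teacher and student feature maps may have different channel counts, which is what lets a narrow student imitate a wider teacher. Unlike a hard argmax, the soft-argmax output is not restricted to integer pixel positions, which is the quantization issue the abstract refers to.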

Corresponding author

Correspondence to Shenqi Lai.

A Part Localization Results

Part localization results on the validation set are shown in Fig. 4.

Fig. 4. Part localization results of our method on the validation set.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Tong, Z., Lai, S., Chai, Z. (2021). Light-Weight Distilled HRNet for Facial Landmark Detection. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12665. Springer, Cham. https://doi.org/10.1007/978-3-030-68821-9_35

  • DOI: https://doi.org/10.1007/978-3-030-68821-9_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68820-2

  • Online ISBN: 978-3-030-68821-9

  • eBook Packages: Computer Science (R0)
