Improving RGB-D Face Recognition via Transfer Learning from a Pretrained 2D Network

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12093)

Abstract

2D face recognition has been studied extensively for decades and has achieved remarkable results in recent years. However, it is sensitive to variations in pose, facial expression, and illumination. Depth images provide valuable information that helps model facial boundaries and capture the global facial layout, and they supply low-frequency patterns. Intuitively, RGB-D images are more robust to external environments than RGB images. Unfortunately, RGB-D datasets are orders of magnitude smaller than 2D datasets and are insufficient to train a deep CNN model as effective as RGB-based models. To tackle these challenges, we present an RGB-D ResNet50 model that takes RGB-D images as input and can be transferred from a pretrained RGB model. We achieved an accuracy of 94.64% and won first place in the 3D Face Recognition Algorithm Challenge of the 2019 BenchCouncil International Artificial Intelligence System Challenges.

The source code is available at https://github.com/xingwxiong/Face3D-Pytorch.
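
As a rough illustration of how a pretrained RGB network can be made to accept RGB-D input, the sketch below widens a first convolution layer from three to four input channels while keeping the pretrained RGB filters. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper name inflate_first_conv is hypothetical, and initializing the depth filters with the mean of the RGB filters is one common choice that the abstract does not confirm. The stand-in layer uses the shape of the first convolution in the standard ResNet50 (64 filters, 7x7 kernel, stride 2, padding 3).

```python
import torch
import torch.nn as nn


def inflate_first_conv(conv: nn.Conv2d, extra_channels: int = 1) -> nn.Conv2d:
    """Return a copy of `conv` that accepts `extra_channels` more input channels.

    Hypothetical helper: the pretrained filters are copied unchanged, and each
    new (depth) channel starts from the mean of the pretrained RGB filters.
    That initialization is an assumption made for illustration.
    """
    if conv.groups != 1:
        raise ValueError("this sketch only handles ungrouped convolutions")

    new_conv = nn.Conv2d(
        in_channels=conv.in_channels + extra_channels,
        out_channels=conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # Keep the pretrained weights for the original (RGB) channels.
        new_conv.weight[:, : conv.in_channels] = conv.weight
        # Mean over the input-channel axis, shape (out, 1, kH, kW).
        depth_init = conv.weight.mean(dim=1, keepdim=True)
        new_conv.weight[:, conv.in_channels :] = depth_init.expand(
            -1, extra_channels, -1, -1
        )
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Stand-in for the first layer of a pretrained RGB ResNet50.
rgb_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
rgbd_conv = inflate_first_conv(rgb_conv)

# A batch of two 4-channel (RGB + depth) images.
features = rgbd_conv(torch.randn(2, 4, 224, 224))
print(rgbd_conv.weight.shape, features.shape)
```

With a real pretrained model, the widened layer would replace the model's first convolution and the remaining layers would keep their pretrained weights. Whether to freeze or fine-tune them is a separate design choice that this sketch does not address.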

Author information

Corresponding author

Correspondence to Xingwang Xiong.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Xiong, X., Wen, X., Huang, C. (2020). Improving RGB-D Face Recognition via Transfer Learning from a Pretrained 2D Network. In: Gao, W., Zhan, J., Fox, G., Lu, X., Stanzione, D. (eds) Benchmarking, Measuring, and Optimizing. Bench 2019. Lecture Notes in Computer Science, vol 12093. Springer, Cham. https://doi.org/10.1007/978-3-030-49556-5_14

  • DOI: https://doi.org/10.1007/978-3-030-49556-5_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49555-8

  • Online ISBN: 978-3-030-49556-5

  • eBook Packages: Computer Science, Computer Science (R0)
