
Unsupervised masked face inpainting based on contrastive learning and attention mechanism

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Masked face inpainting, which aims to restore realistic facial details and complete textures, remains a challenging task. In this paper, an unsupervised masked face inpainting method based on contrastive learning and an attention mechanism is proposed. First, to remove the need for a paired training dataset, a contrastive learning framework is constructed by comparing features extracted from inpainted face image patches with those from the corresponding patches of the input masked face image. Second, to extract more effective facial features, a feature attention module is designed that focuses on significant feature information and establishes long-range dependencies. In addition, a PatchGAN-based discriminator is refined with spectral normalization to stabilize training of the proposed network and to guide the generator toward more realistic face images. Extensive experimental results indicate that our approach outperforms the comparison approaches overall in terms of both subjective and objective evaluations, as well as face recognition accuracy.
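The paper's exact architecture is not reproduced in this abstract, but the three ingredients it names are all standard building blocks. The following PyTorch sketch illustrates them under our own assumptions: a PatchNCE-style contrastive loss in the spirit of Park et al.'s contrastive unpaired translation, a hypothetical feature attention module combining SE-style channel attention with a non-local block for long-range dependencies, and a PatchGAN discriminator whose convolutions are wrapped in spectral normalization. All names and hyperparameters here are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def patch_nce_loss(feat_inpainted, feat_masked, temperature=0.07):
    """PatchNCE-style loss (assumption: CUT-like sampling). Features sampled at
    the same spatial locations of the inpainted output and the masked input are
    positives; all other patch pairs in the batch act as negatives.
    feat_* : (num_patches, dim) feature vectors."""
    feat_q = F.normalize(feat_inpainted, dim=1)
    feat_k = F.normalize(feat_masked, dim=1)
    logits = feat_q @ feat_k.t() / temperature      # (N, N) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)         # diagonal = positive pairs


class FeatureAttention(nn.Module):
    """Hypothetical stand-in for the paper's feature attention module:
    channel reweighting followed by a lightweight non-local block."""

    def __init__(self, ch, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())
        self.theta = nn.Conv2d(ch, ch // 2, 1)
        self.phi = nn.Conv2d(ch, ch // 2, 1)
        self.g = nn.Conv2d(ch, ch // 2, 1)
        self.out = nn.Conv2d(ch // 2, ch, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Channel attention: reweight channels by globally pooled context.
        wgt = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        x = x * wgt
        # Non-local attention: every position attends to every other position,
        # modelling the long-range dependencies mentioned in the abstract.
        q = self.theta(x).flatten(2).transpose(1, 2)    # (b, hw, c/2)
        k = self.phi(x).flatten(2)                      # (b, c/2, hw)
        v = self.g(x).flatten(2).transpose(1, 2)        # (b, hw, c/2)
        attn = torch.softmax(q @ k / (q.size(-1) ** 0.5), dim=-1)
        y = (attn @ v).transpose(1, 2).view(b, c // 2, h, w)
        return x + self.out(y)                          # residual connection


def sn_conv(in_ch, out_ch, stride):
    # Spectral normalization bounds each layer's Lipschitz constant, which is
    # the stabilizing refinement of the discriminator described in the paper.
    return nn.utils.spectral_norm(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=stride, padding=1))


class SNPatchDiscriminator(nn.Module):
    """PatchGAN-style discriminator: outputs a grid of real/fake scores,
    one per overlapping image patch, rather than a single scalar."""

    def __init__(self, in_ch=3, base=64):
        super().__init__()
        self.net = nn.Sequential(
            sn_conv(in_ch, base, 2), nn.LeakyReLU(0.2, inplace=True),
            sn_conv(base, base * 2, 2), nn.LeakyReLU(0.2, inplace=True),
            sn_conv(base * 2, base * 4, 2), nn.LeakyReLU(0.2, inplace=True),
            sn_conv(base * 4, base * 8, 1), nn.LeakyReLU(0.2, inplace=True),
            sn_conv(base * 8, 1, 1))                    # per-patch realism score

    def forward(self, x):
        return self.net(x)
```

Because the contrastive loss only compares features of the output against the input itself, no ground-truth unmasked face is required during training, which is what makes the setup unsupervised in the unpaired sense.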




Data availability statement

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This study has been supported in part by the National Natural Science Foundation of China (62261025, 62362032, and 62262023), by the Natural Science Foundation of Jiangxi Province (20232BAB212015), and by the Xizang Autonomous Region Science and Technology Plan (XZ202303ZY0005G).

Author information

Authors and Affiliations

Authors

Contributions

W. Wan and S. Chen wrote the main manuscript text, Y. Zhang developed the methodology, and L. Yao prepared the figures and tables. All authors reviewed the manuscript.

Corresponding author

Correspondence to Weiguo Wan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by Ting Yao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wan, W., Chen, S., Yao, L. et al. Unsupervised masked face inpainting based on contrastive learning and attention mechanism. Multimedia Systems 30, 209 (2024). https://doi.org/10.1007/s00530-024-01411-y

