Abstract
The face possesses a rich spatial structure that can provide valuable cues to guide various face-related tasks. The eyes are an important socio-visual cue for effective interpersonal communication and an integral feature of facial expressions. However, virtual reality headsets occlude a significant portion of the face and restrict the visibility of certain facial features, particularly the eye region. Reproducing this region with realistic content and handling complex eye movements such as blinks is challenging. Previous facial inpainting methods do not adequately capture subtle eye movements. In view of this, we propose a solution that refines the reconstructions, particularly around the eye region, by leveraging the inherent structure of the eye. We introduce spatial supervision and a novel landmark predictor module to regularize per-frame reconstructions obtained from an existing image-based facial de-occlusion network. Experiments verify that our approach improves the quality of the reconstructions and better captures subtle eye movements.
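As a rough illustration of what landmark-based supervision of a reconstruction could look like, the sketch below combines a per-pixel reconstruction term with a landmark term that weights eye landmarks more heavily. The abstract does not state the loss, so this is not the paper's formulation: the function name `landmark_supervised_loss`, the L1 and squared-distance terms, and the parameters `lam` and `eye_weight` are all assumptions made for illustration.

```python
import numpy as np


def landmark_supervised_loss(pred_img, target_img, pred_lmk, target_lmk,
                             eye_idx, lam=0.1, eye_weight=2.0):
    """Hypothetical loss: L1 reconstruction plus weighted landmark error.

    pred_img, target_img: float arrays of identical shape (e.g. H x W x 3).
    pred_lmk, target_lmk: float arrays of shape (N, 2) holding (x, y) points.
    eye_idx: indices of the landmarks that belong to the eye region.
    lam, eye_weight: illustrative hyperparameters, not taken from the paper.
    """
    # Per-pixel reconstruction error (mean absolute difference).
    recon = np.abs(pred_img - target_img).mean()

    # Up-weight eye landmarks so errors around the eyes cost more.
    weights = np.ones(len(target_lmk))
    weights[eye_idx] = eye_weight

    # Weighted mean of squared Euclidean distances between landmark pairs.
    sq_dist = ((pred_lmk - target_lmk) ** 2).sum(axis=1)
    lmk = (weights * sq_dist).sum() / weights.sum()

    return recon + lam * lmk


# Toy usage with random data standing in for a frame and its landmarks.
rng = np.random.default_rng(0)
target = rng.random((8, 8, 3))
landmarks = rng.random((4, 2))
loss_same = landmark_supervised_loss(target, target, landmarks, landmarks,
                                     eye_idx=[0, 1])
```

In a training setup, the predicted landmarks would come from a landmark predictor applied to the reconstructed frame, so that landmark errors propagate back to the reconstruction network; that wiring is omitted here.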
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gupta, S., Jinka, S.S., Sharma, A., Namboodiri, A. (2023). Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-Based Applications. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13805. Springer, Cham. https://doi.org/10.1007/978-3-031-25072-9_21
DOI: https://doi.org/10.1007/978-3-031-25072-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25071-2
Online ISBN: 978-3-031-25072-9
eBook Packages: Computer Science, Computer Science (R0)