
Rendering Personalized Real-Time Expressions While Speaking Under a Mask

  • Conference paper
  • In: HCI International 2022 - Late Breaking Papers. Multimodality in Advanced Interaction Environments (HCII 2022)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13519)

Abstract

During the COVID-19 pandemic, people often wear masks in daily activities and communication. To address the problem of generating faces with expressions under masks, we propose a framework that detects the shape and location of the masked face and then generates the facial expression hidden beneath the mask. To synthesize high-quality facial expressions, we further optimize the merging of intermediate results using facial information such as key points. In addition, we propose a framework for customizing and personalizing the AI-generated results according to user preferences. We show that the system runs in real time and discuss its development from the perspectives of research, interface design, and applications.
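
The abstract outlines a per-frame pipeline: locate the masked face and its key points, synthesize the expression hidden beneath the mask, and merge the generated region back into the frame. The following is a minimal sketch of such a pipeline, assuming the models are exposed as three hypothetical callables (detect_landmarks, detect_mask_region, generate_expression are illustrative placeholders, not the authors' implementation); the blending step here is a simple box-blur feathering, a stand-in for the key-point-guided merging described in the paper.

```python
import numpy as np

def render_frame(frame: np.ndarray,
                 detect_landmarks,      # frame -> (N, 2) facial key points (assumed callable)
                 detect_mask_region,    # frame -> (H, W) boolean mask of the worn mask (assumed callable)
                 generate_expression,   # (frame, landmarks, style) -> full-frame image with synthesized lower face (assumed callable)
                 user_style=None):      # optional personalization parameters (illustrative)
    """Replace the masked lower-face region with a generated expression."""
    landmarks = detect_landmarks(frame)                               # 1) locate the face and key points
    mask = detect_mask_region(frame)                                  # 2) find the worn-mask pixels
    synth = generate_expression(frame, landmarks, style=user_style)   # 3) synthesize the hidden expression
    # 4) merge: keep original pixels outside the mask, generated pixels inside,
    #    softening the boundary so the seam is not visible.
    alpha = feather(mask.astype(np.float32), radius=7)
    blended = alpha[..., None] * synth + (1.0 - alpha[..., None]) * frame
    return blended.astype(frame.dtype)

def feather(mask: np.ndarray, radius: int) -> np.ndarray:
    """Soften a binary mask with a box blur (a simple stand-in for a proper matting step)."""
    k = 2 * radius + 1
    pad = np.pad(mask, radius, mode="edge")
    out = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return np.clip(out / (k * k), 0.0, 1.0)
```

Running this once per captured frame is what real-time operation amounts to in practice: the detection, generation, and blending steps must together fit within the camera's frame budget (roughly 33 ms at 30 fps), which is why the merging step above is kept deliberately lightweight.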

Acknowledgement

This work was supported by University of Tsukuba (Basic Research Support Program Type A).

Author information

Correspondence to Jun-Li Lu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hashimoto, A., Lu, JL., Ochiai, Y. (2022). Rendering Personalized Real-Time Expressions While Speaking Under a Mask. In: Kurosu, M., et al. HCI International 2022 - Late Breaking Papers. Multimodality in Advanced Interaction Environments. HCII 2022. Lecture Notes in Computer Science, vol 13519. Springer, Cham. https://doi.org/10.1007/978-3-031-17618-0_5

  • DOI: https://doi.org/10.1007/978-3-031-17618-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17617-3

  • Online ISBN: 978-3-031-17618-0

  • eBook Packages: Computer Science, Computer Science (R0)