Abstract
This study proposes a new learning-based smile synthesis system that automatically transforms a given neutral facial image into a smiling one in a specified style. Although example-based face synthesis frameworks have made great progress recently, the construction of a robust transformation, the preservation of personal characteristics, and the production of high-quality images remain unresolved problems. The proposed framework addresses these problems using a new expression-attention-guided global parametric model and local non-parametric model. Our key innovations are (a) a flexible framework design that produces expression attention regions with only expression category labels as supervision, (b) a novel smile style analysis framework that discovers different smile styles in the training samples, which are then used to guide more robust face modeling, and (c) a two-step expression transformation approach that integrates a global parametric model for robust prediction of expression geometry with a local non-parametric model for high-quality image generation. Experimental results show that, given limited training data, the facial images obtained using the proposed framework are more vivid than those generated by existing synthesis methods. In addition, the proposed method extends directly to the image-to-image translation task to produce high-quality face hallucinations, which is of great importance in digital entertainment.
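To make the two-step design concrete, the sketch below shows one way such a pipeline could be organized: a global parametric model (here, a linear regression) first predicts the smiling geometry from the neutral landmarks and a smile-style code, and a local non-parametric step then replaces each patch of the geometry-warped image with its nearest exemplar from smiling training images. All names (predict_smile_geometry, local_patch_synthesis), the linear-regression form, and the SSD patch lookup are illustrative assumptions, not the authors' implementation.

```python
# Illustrative skeleton of a two-step neutral-to-smile pipeline
# (assumed structure; not the authors' code).
import numpy as np

def predict_smile_geometry(landmarks, style_code, W, b):
    """Global parametric step: a linear regression mapping neutral
    landmarks plus a smile-style code to smiling landmarks."""
    x = np.concatenate([landmarks.ravel(), style_code])
    return (W @ x + b).reshape(landmarks.shape)

def local_patch_synthesis(warped_img, exemplars, patch=16):
    """Local non-parametric step: substitute each non-overlapping patch
    of the geometry-warped (grayscale) image with its nearest exemplar
    patch, harvested from smiling training images."""
    out = warped_img.copy()
    h, w = warped_img.shape
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            q = warped_img[i:i + patch, j:j + patch].ravel()
            d = ((exemplars - q) ** 2).sum(axis=1)  # SSD to every exemplar
            out[i:i + patch, j:j + patch] = exemplars[d.argmin()].reshape(patch, patch)
    return out
```

In the paper's terms, the first step supplies robust expression geometry while the second restores person-specific, high-frequency texture; a deployed system would presumably also blend patch seams and restrict synthesis to the expression attention regions.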
Data availability
The data that support the findings of this study are available from the corresponding author, Ching-Ting Tu, upon reasonable request.
Change history
23 December 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11042-022-14324-7
References
Bouaziz S, Pauly M (2014) Semi-supervised facial animation retargeting. EPFL Technical Report #202143
Bozorgtabar B, Mahapatra D, Thiran J-P (2020) ExprADA: adversarial domain adaptation for facial expression analysis. Patt Recognit:107111
Choi Y, Choi M-J, Kim M, Ha J-W, Kim S, Choo J (2018) StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. CVPR:8789–8797
Choi Y, Uh Y, Yoo J, Ha J-W (2020) StarGAN v2: diverse image synthesis for multiple domains. CVPR:8185–8194
Chowdhary CL, Patel PV, Kathrotia KJ, Attique M, Kumaresan P, Ijaz MF (2020) Analytical study of hybrid techniques for image encryption and decryption. Sensors:5162
Deng Z, Neumann U, Lewis JP, Kim TY, Bulut M, Narayanan S (2006) Expressive facial animation synthesis by learning speech coarticulation and expression spaces. IEEE Trans Vis Comput Graph 12:1523–1534
Etoundi CML, Nkapkop JDD, Tsafack N, Ngono JM, Ele P, Wozniak M, Shafi J, Ijaz MF (2022) A novel compound-coupled hyperchaotic map for image encryption. Symmetry:493
Fan G-F, Zhang L-Z, Yu M, Hong W-C, Dong S-Q (2022) Applications of random forest in multivariable response surface for short-term load forecasting. Int J Electrical Power Energy Syst
Freeman WT, Pasztor EC (1999) Learning low-level vision. ICCV:1182–1189
Ghahramani Z, Hinton GE (1997) The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1
Gong B, Wang Y, Liu J, Tang X (2009) Automatic facial expression recognition on a single 3D face by exploring shape deformation. ACM Multimedia:569–572
Huang D, Torre FDL (2010) Bilinear kernel reduced rank regression for facial expression synthesis. ECCV:364–377
Huang L, Su C (2006) Facial expression synthesis using manifold learning and belief propagation. Soft Comput:1193–1200
Kanade T, Cohn JF, Tian Y (2000) Comprehensive database for facial expression analysis. Int Conf Aut Face Gesture Recogn:46–53
Khan N, Akram A, Mahmood A, Ashraf S, Murtaza K (2020) Masked linear regression for learning local receptive fields for facial expression synthesis. Int J Comput Vis 128:1433–1454
Li K, Dai Q, Wang R, Liu Y, Xu F, Wang J (2014) A data-driven approach for facial expression retargeting in video. IEEE Trans Multimedia 16:299–310
Liu W, Chen W, Yang Z, Shen L (2021) Translate the facial regions you like using self-adaptive region translation. AAAI 35:2180–2188
Lu Z, Hu T, Song L, Zhang Z, He R (2018) Conditional expression synthesis with face parsing transformation. ACM Multimedia:1083–1091
Mohammed U, Prince SJD, Kautz J (2009) Visio-lization: generating novel facial images. SIGGRAPH
Noh JY, Neumann U (2006) Expression cloning. ACM SIGGRAPH Courses
Peng Y, Yin H (2019) ApprGAN: appearance-based GAN for facial expression synthesis. IET Image Process 13:2706–2715
Pumarola A, Agudo A, Martínez AM, Sanfeliu A, Moreno-Noguer F (2020) GANimation: one-shot anatomically consistent facial animation. Int J Comput Vis 128:698–713
Sahoo KK, Dutta I, Ijaz MF, Wozniak M, Singh PK (2021) TLEFuzzyNet: fuzzy rank-based ensemble of transfer learning models for emotion recognition from human speeches. IEEE Access 9:166518–166530
Song Y, Bao L, Yang Q, Yang M-H (2014) Real-time exemplar-based face sketch synthesis. ECCV:800–813
Tamang J, Nkapkop JDD, Ijaz MF, Prasad PK, Tsafack N, Saha A, Kengne J, Son Y (2021) Dynamical properties of ion-acoustic waves in space plasma and its application to image encryption. IEEE Access 9:18762–18782
Tang H, Liu H, Xu D, Torr PHS, Sebe N (2021) AttentionGAN: unpaired image-to-image translation using attention-guided generative adversarial networks. IEEE Trans Neural Networks Learn Syst
Torralba A, Murphy KP, Freeman WT (2007) Sharing visual features for multiclass and multi-view object detection. IEEE Trans Patt Anal Mach Intell 29:854–869
Tran DL, Walecki RT, Rudovic O, Eleftheriadis S, Schuller BW, Pantic M (2017) DeepCoder: semi-parametric variational autoencoders for automatic facial action coding. ICCV:3209–3218
Wang S, Gu XD, Qin H (2008) Automatic non-rigid registration of 3D dynamic data for facial expression. CVPR:1–8
Xia J, Quynh DTP, He Y, Chen X, Hoi SCH (2012) Modeling and compressing 3-D facial expressions using geometry videos. IEEE Trans Circ Syst Video Technol 22:77–90
Xu W, Xie X, Lai J (2021) RelightGAN: instance-level generative adversarial network for face illumination transfer. IEEE Trans Image Process 30:3450–3460
Yun T, Guan L (2013) A deformable 3-D facial expression model for dynamic human emotional state recognition. IEEE Trans Circ Syst Video Technol:142–157
Zhang Q, Liu Z, Guo G, Terzopoulos D, Shum HY (2006) Geometry-driven photorealistic facial expression synthesis. IEEE Trans Vis Comput Graph 12(1):48–60
Zhang Y, Ji Q, Zhu Z, Yi B (2008) Dynamic facial expression analysis and synthesis with MPEG-4 facial animation parameters. IEEE Trans Circ Syst Video Technol 18:1383–1396
Zhang F, Zhang T, Mao Q, Xu C (2020) Geometry guided pose-invariant facial expression recognition. IEEE Trans Image Process:4445–4460
Zhang F, Zhang T, Mao Q, Xu C (2020) A unified deep model for joint facial expression recognition, face synthesis, and face alignment. IEEE Trans Image Process 29:6574–6589
Zhu J-Y, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. ICCV:2242–2251
Acknowledgements
This work was supported by the Ministry of Science and Technology of Taiwan under Grant number MOST 109-2221-E-005-056-MY2. We thank the anonymous reviewers for their insightful comments, which improved this paper. We also thank all the authors who released their source code publicly, allowing us to adapt it for the comparison methods in this study.
Funding
This work was supported by the Ministry of Science and Technology of Taiwan under Grant number MOST 109-2221-E-005-056-MY2.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: The affiliation of the 2nd and 3rd authors in the original publication of this article was incorrect.
Appendix
The detailed algorithm for extracting expression-variant patches is given in Algorithm 1, where the smiling images and neutral images in the training set serve as positive and negative samples, respectively. The goal of the algorithm is to extract a set of discriminative features that separate the positive samples from the negative ones; the facial regions corresponding to these extracted features are defined as expression-variant patches (EVPs) in this study. A sketch of this selection procedure follows the algorithm caption below.
Algorithm 1 Expression-variant patch extraction by the GentleBoost algorithm [27].
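Since the algorithm figure itself is not reproduced here, the following is a minimal sketch of the GentleBoost feature selection that the caption describes, under the assumptions that each face is summarized by one scalar feature per candidate patch and that regression stumps are the weak learners; the helper names (fit_stump, extract_evp) are hypothetical, not from the paper or from [27].

```python
# Hedged sketch of expression-variant patch (EVP) extraction with GentleBoost.
import numpy as np

def fit_stump(X, y, w):
    """Fit a weighted regression stump f(x) = a*[x_j > t] + b by weighted
    least squares, as in GentleBoost. Returns (j, t, a, b, err)."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] > t
            wp, wn = w[mask].sum(), w[~mask].sum()
            if wp == 0 or wn == 0:
                continue
            # Weighted means of the labels on each side of the threshold.
            b = np.sum(w[~mask] * y[~mask]) / wn
            a = np.sum(w[mask] * y[mask]) / wp - b
            err = np.sum(w * (y - (a * mask + b)) ** 2)
            if best is None or err < best[-1]:
                best = (j, t, a, b, err)
    return best

def extract_evp(X, y, n_rounds=10):
    """X: (n_samples, n_patches) patch-wise features; y in {+1, -1}
    (smiling = +1, neutral = -1). Returns indices of the candidate
    patches selected by the boosted stumps."""
    n = len(y)
    w = np.full(n, 1.0 / n)           # uniform initial sample weights
    selected = []
    for _ in range(n_rounds):
        j, t, a, b, _ = fit_stump(X, y, w)
        pred = a * (X[:, j] > t) + b  # weak learner output
        w *= np.exp(-y * pred)        # GentleBoost weight update
        w /= w.sum()
        selected.append(j)
    return sorted(set(selected))
```

The indices returned by extract_evp identify the candidate patches on which the boosted classifier relies to separate smiling from neutral faces; in the paper's terminology, these patch locations are the EVPs.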
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tu, CT., Hsieh, SH., Chen, KL. et al. Personalized smile synthesis using attention-guided global parametric model and local non-parametric model. Multimed Tools Appl 82, 21585–21609 (2023). https://doi.org/10.1007/s11042-022-14260-6