Facial expression synthesis based on similar faces

Published in Multimedia Tools and Applications

Abstract

Facial expression synthesis has several applications, including animation, human-computer interaction, entertainment, and the training of people with mental disorders. It aims to alter the facial expression in an image, usually by reenacting the facial movements of an example image in the target image. Deformation-based approaches usually choose the example image manually, so the results vary with this choice. This study differs from the literature by proposing and evaluating techniques that use the similarity between facial images to choose the source image. The primary goal is to investigate how source image selection influences the generated facial expressions of emotion. We propose three techniques for selecting similar faces in the facial expression synthesis pipeline and compare them to other approaches. We also compare the generated synthetic emotions with the results of recent methods from the literature using objective metrics. Our findings suggest that one of the proposed techniques achieved better results in the search for similar faces and similar or better synthesis results than methods from the literature. Additionally, a visual analysis showed that similar faces can improve the realism of synthetic images, especially compared to randomly selected facial images.
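To make the pipeline concrete, the sketch below shows one plausible way to select a source image by facial similarity. It is illustrative only, not the paper's method: the three proposed techniques are not detailed in this abstract, so the sketch assumes off-the-shelf deep face embeddings from the face_recognition library (dlib's metric-learning model) with Euclidean distance, and the pick_most_similar_source helper and its file-path inputs are hypothetical.

```python
# Hypothetical sketch of similarity-based source selection. This is NOT the
# paper's technique: it uses off-the-shelf face_recognition embeddings
# (dlib's deep metric learning model) and Euclidean distance.
import numpy as np
import face_recognition

def pick_most_similar_source(target_path, candidate_paths):
    """Return (path, distance) of the candidate whose face embedding
    lies closest to the target's in Euclidean distance."""
    target_image = face_recognition.load_image_file(target_path)
    # Assumes exactly one face is visible in the target image.
    target_encoding = face_recognition.face_encodings(target_image)[0]

    best_path, best_distance = None, np.inf
    for path in candidate_paths:
        encodings = face_recognition.face_encodings(
            face_recognition.load_image_file(path))
        if not encodings:  # skip candidates where no face is detected
            continue
        distance = np.linalg.norm(encodings[0] - target_encoding)
        if distance < best_distance:
            best_path, best_distance = path, distance
    return best_path, best_distance

# Usage: the selected source face would then serve as the example whose
# facial movements are reenacted in the target image.
# source, _ = pick_most_similar_source("target.jpg", ["a.jpg", "b.jpg"])
```

Under this sketch, the chosen source face plays the role of the manually selected example image that deformation-based approaches would otherwise require.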




Acknowledgements

The authors would like to thank the VISGRAF Lab at the Instituto Nacional de Matemática Pura e Aplicada and the Multimedia Understanding Group at the Aristotle University of Thessaloniki for providing the images used in this study. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001, the Dean's Office for Research of the University of São Paulo (process 18.5.245.86.7 – Intelligent psychiatric disorder classification system based on facial anthropometric measurements), the Brazilian National Council for Scientific and Technological Development (CNPq) (process 157535/2017-7), and the São Paulo Research Foundation (FAPESP) (process 14/50889-7: National Institute of Science and Technology – Medicine Assisted by Scientific Computing, INCT-MACC).

Author information

Corresponding author

Correspondence to Rafael Luiz Testa.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Testa, R.L., Machado-Lima, A. & Nunes, F.L.S. Facial expression synthesis based on similar faces. Multimed Tools Appl 80, 36465–36489 (2021). https://doi.org/10.1007/s11042-021-11525-4

