ABSTRACT
Facial emotion generation remains a challenging task because emotion classes are highly similar to one another. In addition, the model must learn from images with varied lighting conditions and diverse facial structures. To address these challenges, we propose a modification of StarGAN that applies differentiable augmentation to generate realistic facial emotions. Our approach allows both the generator and the discriminator to generalize better from the data. Finally, we evaluate the model with an emotion classifier and conduct a quantitative analysis by measuring the accuracy of the generated emotions.
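The core idea of differentiable augmentation is to apply the same stochastic, gradient-preserving transforms to both real and generated images before they reach the discriminator, so the discriminator never overfits to un-augmented data. Below is a minimal NumPy sketch of this idea, with hypothetical brightness-shift and translation policies; the actual DiffAugment method of Zhao et al. operates on autograd tensors (e.g., in PyTorch) and uses color, translation, and cutout policies.

```python
import numpy as np

def diff_augment(batch, rng):
    """Apply simple augmentations to an image batch (sketch only).

    batch: (N, H, W, C) float array with values in [0, 1].
    In GAN training, the SAME kinds of transforms are applied to both
    the real batch and the generated batch before the discriminator
    sees them. Each op here is differentiable with respect to the
    input pixels, so generator gradients pass through unchanged.
    """
    n = batch.shape[0]
    # Brightness: add a random per-image offset (differentiable shift).
    offsets = rng.uniform(-0.2, 0.2, size=(n, 1, 1, 1))
    batch = batch + offsets
    # Translation: circular shift by a few pixels; a rearrangement of
    # pixels, so gradients simply follow the moved positions.
    dy, dx = rng.integers(-2, 3, size=2)
    batch = np.roll(batch, shift=(dy, dx), axis=(1, 2))
    return np.clip(batch, 0.0, 1.0)

# In a StarGAN-style training step (hypothetical names D, G, x_real):
#   d_real = D(diff_augment(x_real, rng), ...)
#   d_fake = D(diff_augment(G(x_real, target_label), rng), ...)
```

Because the discriminator only ever scores augmented images, real and fake batches are compared under matched distortions, which is what lets a GAN train stably on smaller emotion datasets.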