
Facial Emotion Recognition Based on Optimized Xception Training

Published: 12 December 2024

Abstract

This study pursues high accuracy in facial emotion recognition (FER). Conventional approaches to feature extraction and classification in FER are laborious and depend heavily on the quality of hand-crafted features, which introduces significant variability and subjectivity. To address these limitations and improve both the efficiency and accuracy of FER, this study adopts the FER2013 dataset, a standard benchmark for evaluating FER models.
Using the Xception model as the foundational architecture, this research applies transfer learning from a pre-trained model to improve the quality of the trained network. Xception, built on depthwise separable convolutions, learns efficiently and captures intricate features from visual data. Starting from weights pre-trained on large, diverse datasets gives the model a more robust and generalized feature representation for the task at hand.
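The paper does not include its training code; a minimal sketch of this transfer-learning setup, assuming tf.keras with ImageNet-pretrained Xception weights (the layer sizes, dropout rate, and learning rate here are illustrative, not the authors' values), might look like:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Xception expects RGB inputs of at least 71x71; FER2013 images are
# 48x48 grayscale, so a resize + channel-replication step is assumed upstream.
base = tf.keras.applications.Xception(
    weights="imagenet",        # transfer learning: start from ImageNet features
    include_top=False,
    input_shape=(71, 71, 3),
    pooling="avg",
)
base.trainable = False         # freeze pre-trained features for the first phase

model = models.Sequential([
    base,
    layers.Dropout(0.3),                    # illustrative regularization
    layers.Dense(7, activation="softmax"),  # FER2013 has 7 emotion classes
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

A common fine-tuning schedule is to train only the new classification head first, then unfreeze the base and continue at a much lower learning rate.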
To further improve performance on FER2013, the study employs data augmentation. Random transformations such as cropping, scaling, and flipping introduce controlled variability into the training data, increasing its diversity and reducing overfitting. Augmentation simulates a broader range of real-world conditions and expressions, improving the model's ability to generalize across emotional states.
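The exact augmentation pipeline is not specified beyond random cropping, scaling, and flipping; a self-contained NumPy sketch of the flip-and-crop part (the function name and crop size are illustrative) could be:

```python
import numpy as np

def augment(img, rng):
    """Randomly flip, crop, and rescale one 48x48 FER2013 face.

    Illustrative sketch: real pipelines typically use library transforms
    (e.g. tf.keras preprocessing layers or torchvision) instead.
    """
    if rng.random() < 0.5:
        img = img[:, ::-1]                       # random horizontal flip
    # random 44x44 crop, i.e. a mild scale/translation jitter
    y, x = rng.integers(0, img.shape[0] - 44 + 1, size=2)
    crop = img[y:y + 44, x:x + 44]
    # nearest-neighbour resize back to 48x48
    idx = np.arange(48) * 44 // 48
    return crop[idx][:, idx]

rng = np.random.default_rng(0)
face = rng.random((48, 48))                      # stand-in grayscale face
batch = np.stack([augment(face, rng) for _ in range(8)])
```

Because the transformations are sampled per image, each epoch effectively sees a different variant of every training face.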
In a comprehensive comparative analysis, the optimized and fine-tuned Xception model is evaluated against other mainstream deep learning architectures from the current literature on metrics including accuracy, precision, recall, and F1 score. After the proposed enhancements, the Xception model surpasses the accuracy of the compared networks on the FER2013 dataset.
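As a reminder of how these metrics are computed from a model's predictions, here is a pure-Python sketch of accuracy plus macro-averaged precision, recall, and F1 (not the authors' evaluation code; libraries such as scikit-learn provide the same calculations):

```python
def macro_scores(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    per_class = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class.append((prec, rec, f1))
    n = len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = sum(p for p, _, _ in per_class) / n
    recall = sum(r for _, r, _ in per_class) / n
    f1 = sum(f for _, _, f in per_class) / n
    return accuracy, precision, recall, f1

# toy example using 3 of FER2013's 7 classes
acc, prec, rec, f1 = macro_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], [0, 1, 2])
```

Macro averaging weights every emotion class equally, which matters on FER2013 because its classes (e.g. "disgust") are heavily imbalanced.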
These findings add to the growing body of research advocating deep learning methods in computer vision, and in the nuanced task of facial emotion recognition in particular. They underscore the value of transfer learning and data augmentation for deep neural networks and offer a promising avenue for future work. The study also provides a foundation for integrating such models into practical applications where accurate interpretation of human emotions can improve user experience and interaction, including healthcare, customer service, and human-computer interaction.



Published In

BDIOT '24: Proceedings of the 2024 8th International Conference on Big Data and Internet of Things
September 2024, 412 pages
ISBN: 9798400717529
DOI: 10.1145/3697355

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. Facial Emotion Recognition
2. Transfer Learning
3. Convolutional Neural Network


Acceptance Rates

Overall acceptance rate: 75 of 136 submissions, 55%
