
Facial Emotion Recognition Based on Optimized Xception Training

Published: 12 December 2024

Abstract

This study pursues high accuracy in facial emotion recognition (FER). Conventional approaches to feature extraction and classification in FER are laborious and depend heavily on the quality of hand-crafted features, which introduces significant variability and subjectivity. To address these limitations and improve both the efficiency and accuracy of FER, this study adopts the FER2013 dataset, a standard benchmark for evaluating FER models.
Using the Xception model as the foundational architecture, this research applies transfer learning from a pre-trained model to improve the quality of the trained network. Xception, built on depthwise separable convolutions, learns efficiently and captures intricate features from visual data. Starting from weights pre-trained on large, diverse datasets gives the model a more robust and generalized feature representation for the task at hand.
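The paper does not include its training code; a minimal sketch of this transfer-learning setup, assuming tf.keras with ImageNet-pretrained Xception weights (the layer sizes, dropout rate, and learning rate here are illustrative, not the authors' values), might look like:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Xception expects RGB inputs of at least 71x71; FER2013 images are
# 48x48 grayscale, so a resize + channel-replication step is assumed upstream.
base = tf.keras.applications.Xception(
    weights="imagenet",        # transfer learning: start from ImageNet features
    include_top=False,
    input_shape=(71, 71, 3),
    pooling="avg",
)
base.trainable = False         # freeze pre-trained features for the first phase

model = models.Sequential([
    base,
    layers.Dropout(0.3),                    # illustrative regularization
    layers.Dense(7, activation="softmax"),  # FER2013 has 7 emotion classes
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

A common fine-tuning schedule is to train only the new classification head first, then unfreeze the base and continue at a much lower learning rate.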
To further improve performance on FER2013, the study employs data augmentation. Random transformations such as cropping, scaling, and flipping introduce controlled variability into the training data, increasing its diversity and reducing overfitting. Augmentation simulates a broader range of real-world conditions and expressions, improving the model's ability to generalize across emotional states.
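The exact augmentation pipeline is not specified beyond random cropping, scaling, and flipping; a self-contained NumPy sketch of the flip-and-crop part (the function name and crop size are illustrative) could be:

```python
import numpy as np

def augment(img, rng):
    """Randomly flip, crop, and rescale one 48x48 FER2013 face.

    Illustrative sketch: real pipelines typically use library transforms
    (e.g. tf.keras preprocessing layers or torchvision) instead.
    """
    if rng.random() < 0.5:
        img = img[:, ::-1]                       # random horizontal flip
    # random 44x44 crop, i.e. a mild scale/translation jitter
    y, x = rng.integers(0, img.shape[0] - 44 + 1, size=2)
    crop = img[y:y + 44, x:x + 44]
    # nearest-neighbour resize back to 48x48
    idx = np.arange(48) * 44 // 48
    return crop[idx][:, idx]

rng = np.random.default_rng(0)
face = rng.random((48, 48))                      # stand-in grayscale face
batch = np.stack([augment(face, rng) for _ in range(8)])
```

Because the transformations are sampled per image, each epoch effectively sees a different variant of every training face.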
In a comprehensive comparative analysis, the optimized and fine-tuned Xception model is evaluated against other mainstream deep learning architectures from the current literature on metrics including accuracy, precision, recall, and F1 score. After the proposed enhancements, the Xception model surpasses the accuracy of the compared networks on the FER2013 dataset.
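As a reminder of how these metrics are computed from a model's predictions, here is a pure-Python sketch of accuracy plus macro-averaged precision, recall, and F1 (not the authors' evaluation code; libraries such as scikit-learn provide the same calculations):

```python
def macro_scores(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    per_class = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class.append((prec, rec, f1))
    n = len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = sum(p for p, _, _ in per_class) / n
    recall = sum(r for _, r, _ in per_class) / n
    f1 = sum(f for _, _, f in per_class) / n
    return accuracy, precision, recall, f1

# toy example using 3 of FER2013's 7 classes
acc, prec, rec, f1 = macro_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], [0, 1, 2])
```

Macro averaging weights every emotion class equally, which matters on FER2013 because its classes (e.g. "disgust") are heavily imbalanced.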
These findings add to the growing body of research advocating deep learning methods in computer vision, and in the nuanced task of facial emotion recognition in particular. They underscore the value of transfer learning and data augmentation for deep neural networks and offer a promising avenue for future work. The study also provides a foundation for integrating such models into practical applications where accurate interpretation of human emotions can improve user experience and interaction, including healthcare, customer service, and human-computer interaction.



Published In

BDIOT '24: Proceedings of the 2024 8th International Conference on Big Data and Internet of Things
September 2024, 412 pages
ISBN: 9798400717529
DOI: 10.1145/3697355

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. Facial Emotion Recognition
2. Transfer Learning
3. Convolutional Neural Network


Acceptance Rates

Overall acceptance rate: 75 of 136 submissions, 55%
