Abstract
Facial expression recognition (FER) is a technology that recognizes human emotions from biometric markers. Over the past decade, FER has been a popular research area, particularly in the computer vision community. With the development of deep learning (DL), FER can achieve impressive recognition accuracy. Moreover, advances in the Internet of Things (IoT) generate the massive amounts of visual data needed for reliable DL-based emotion analysis. However, training DL models can incur significant memory consumption and computational cost, which complicates many vision tasks. Additionally, the direct use of RGB images during the training and inference stages may raise privacy concerns in various FER applications. Furthermore, adopting large deep networks hampers fast and accurate recognition on resource-constrained end devices such as smartphones. As a viable solution, edge computing brings data storage and computation closer to end devices instead of relying on a central cloud server. It can therefore facilitate the deployment of real-time FER applications, since latency and efficiency problems are addressed by exploiting computing resources at the edge. In this paper, we develop an efficient FER framework that integrates DL with edge computing. Our framework relies on facial landmarks to enable privacy-preserving and low-latency FER. Accordingly, we empirically study various landmark detection models and feature types to investigate their ability to capture the dynamic information of facial expressions in videos. Then, using the extracted landmark-based features, we design lightweight DL models to classify human emotions on IoT devices. Extensive experiments on benchmark datasets further validate the efficiency and robustness of our framework.
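To make the landmark-based design concrete, the following is a minimal, hypothetical sketch (in PyTorch) of a lightweight classifier that consumes per-frame facial landmarks instead of RGB frames, which is the key to the privacy-preserving, low-latency setup described above. The 68-point landmark layout, 7 emotion classes, and GRU-based temporal module are illustrative assumptions only, not the specific models or feature types evaluated in the paper.

```python
# Hypothetical sketch of a landmark-based FER classifier (not the authors' exact models).
# Input: per-frame facial landmarks (T frames x L landmarks x 2 coordinates),
# so raw RGB face images never need to leave the end device.
import torch
import torch.nn as nn

class LandmarkFER(nn.Module):
    """Lightweight classifier over a sequence of facial landmarks."""
    def __init__(self, num_landmarks=68, num_classes=7, hidden=64):
        super().__init__()
        # Per-frame embedding of flattened (x, y) landmark coordinates.
        self.frame_embed = nn.Sequential(
            nn.Linear(num_landmarks * 2, hidden),
            nn.ReLU(),
        )
        # Small GRU captures the temporal dynamics of the expression.
        self.temporal = nn.GRU(hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, landmarks):             # landmarks: (B, T, L, 2)
        b, t, l, c = landmarks.shape
        x = self.frame_embed(landmarks.reshape(b, t, l * c))  # (B, T, hidden)
        _, h = self.temporal(x)                # h: (1, B, hidden)
        return self.classifier(h[-1])          # (B, num_classes)

# Example: one 16-frame clip with 68 normalized landmarks per frame.
model = LandmarkFER()
clip = torch.randn(1, 16, 68, 2)
logits = model(clip)
print(logits.shape)                            # torch.Size([1, 7])
```

Because the input is a short sequence of 2D coordinates rather than video frames, a model of this size has on the order of tens of thousands of parameters, which is what makes on-device or edge-side inference practical in the scenario the paper targets.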










Funding
This work was supported by the Faculty Development Competitive Research Grant Program (Funder Project References: 11022021FD2925 and 021220FD1451) at Nazarbayev University.
Author information
Contributions
NA elaborated on the methodology, conceptualized the technical content, performed the experiments and analysis, and wrote the paper. AZ elaborated on the methodology, conceptualized the technical content, and wrote the paper. TA contributed data or analysis tools and wrote the paper. D-MB performed the experiments and analysis and reviewed the manuscript. NAT elaborated on the methodology, conceptualized the technical content, and wrote and reviewed the paper.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Ethical approval
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Aikyn, N., Zhanegizov, A., Aidarov, T. et al. Efficient facial expression recognition framework based on edge computing. J Supercomput 80, 1935–1972 (2024). https://doi.org/10.1007/s11227-023-05548-x