Efficient facial expression recognition framework based on edge computing

Abstract

Facial expression recognition (FER) is a technology that recognizes human emotions from biometric markers. Over the past decade, FER has been a popular research area, particularly in the computer vision community. With the development of deep learning (DL), FER can achieve impressive recognition accuracy. In addition, advances in the Internet of Things (IoT) generate the massive amounts of visual data needed for reliable DL-based emotion analysis. However, training DL models incurs significant memory consumption and computational costs, which complicates many vision tasks. Moreover, the direct use of RGB images during training and inference can raise privacy concerns in various FER applications, and adopting large deep networks hampers quick and accurate recognition on resource-constrained end devices such as smartphones. As a viable solution, edge computing brings data storage and computation closer to end devices rather than relying on a central cloud server. It can therefore facilitate the deployment of real-time FER applications, since latency and efficiency problems are well addressed by utilizing computing resources at the edge. In this paper, we develop an efficient FER framework that integrates DL with edge computing. Our framework relies on facial landmarks to enable privacy-preserving and low-latency FER. Accordingly, we empirically study various landmark detection models and feature types to investigate their ability to capture the dynamic information of facial expressions in videos. Then, using the extracted landmark-based features, we design lightweight DL models to classify human emotions on IoT devices. Extensive experiments on benchmark datasets further validate the efficiency and robustness of our framework.
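
To make the pipeline described above concrete, the sketch below illustrates the general idea of landmark-based FER in Python: per-frame facial landmarks stand in for raw RGB frames (preserving privacy), simple geometric and motion features capture expression dynamics, and a lightweight recurrent network classifies the clip. The 68-point landmark scheme, the displacement features, and the small GRU classifier are illustrative assumptions chosen for exposition, not the specific detectors, features, or models evaluated in the paper.

```python
# Minimal sketch of a landmark-based FER pipeline in the spirit of the paper:
# per-frame facial landmarks replace raw RGB frames (privacy), and a small
# temporal classifier is compact enough for an IoT/edge device. The feature
# design and model here are illustrative assumptions, not the authors' exact
# configuration.
import numpy as np
import torch
import torch.nn as nn

NUM_LANDMARKS = 68   # assumption: the classic 68-point annotation scheme
NUM_EMOTIONS = 7     # assumption: e.g., the 7 CK+ expression classes

def landmark_features(seq: np.ndarray) -> np.ndarray:
    """Convert a (T, NUM_LANDMARKS, 2) landmark sequence into per-frame
    feature vectors: coordinates centered on the face plus frame-to-frame
    displacements, capturing the dynamics of the expression."""
    centered = seq - seq.mean(axis=1, keepdims=True)   # translation-invariant geometry
    disp = np.diff(seq, axis=0, prepend=seq[:1])       # temporal motion cues
    feats = np.concatenate([centered, disp], axis=-1)  # (T, NUM_LANDMARKS, 4)
    return feats.reshape(seq.shape[0], -1).astype(np.float32)

class LandmarkGRU(nn.Module):
    """Lightweight temporal classifier; with hidden=64 the weights occupy
    well under 1 MB, i.e., smartphone-class."""
    def __init__(self, in_dim=NUM_LANDMARKS * 4, hidden=64, classes=NUM_EMOTIONS):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, classes)

    def forward(self, x):        # x: (B, T, in_dim)
        _, h = self.gru(x)       # final hidden state summarizes the clip
        return self.head(h[-1])  # (B, classes) emotion logits

# Usage with a dummy 30-frame clip of detected landmarks:
landmarks = np.random.rand(30, NUM_LANDMARKS, 2)  # stand-in for a detector's output
x = torch.from_numpy(landmark_features(landmarks)).unsqueeze(0)
logits = LandmarkGRU()(x)
print(logits.shape)  # torch.Size([1, 7])
```

In a deployment following the paper's edge-computing design, the landmark detector would run on or near the end device, and only the compact landmark-based features, never the raw video, would be sent onward for classification.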


Data availability

The datasets used can be accessed from [53] and [54].

References

  1. Sharma P, Sharma P, Deep V, Shukla VK (2021) Facial emotion recognition model. In: Lecture Notes in Mechanical Engineering, pp 751–761. https://doi.org/10.1007/978-981-15-9956-9_73

  2. Li S, Deng W (2020) Deep facial expression recognition: a survey. IEEE Trans Affect Comput 13:1195–1215

  3. Jung H, Lee S, Yim J, Park S, Kim J (2015) Joint fine-tuning in deep neural networks for facial expression recognition. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp 2983–2991

  4. Tu NA, Wong K-S, Demirci MF, Lee Y-K et al (2021) Toward efficient and intelligent video analytics with visual privacy protection for large-scale surveillance. J Supercomput 77(12):14374–14404

  5. Zhao Y, Xu K, Wang H, Li B, Qiao M, Shi H (2021) MEC-enabled hierarchical emotion recognition and perturbation-aware defense in smart cities. IEEE Internet Things J 8(23):16933–16945

  6. Muhammad G, Hossain MS (2021) Emotion recognition for cognitive edge computing using deep learning. IEEE Internet Things J 8(23):16894–16901

  7. Hu M, Wang H, Wang X, Yang J, Wang R (2019) Video facial emotion recognition based on local enhanced motion history image and CNN-CTSLSTM networks. J Vis Commun Image Represent 59:176–185

  8. Munasinghe M (2018) Facial expression recognition using facial landmarks and random forest classifier. In: 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), pp 423–427. IEEE

  9. Mollahosseini A, Chan D, Mahoor MH (2016) Going deeper in facial expression recognition using deep neural networks. In: 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pp 1–10. IEEE

  10. Melinte DO, Vladareanu L (2020) Facial expressions recognition for human–robot interaction using deep convolutional neural networks with rectified adam optimizer. Sensors 20(8):2393

  11. Siqueira H, Magg S, Wermter S (2020) Efficient facial feature learning with wide ensemble-based convolutional neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 34, pp 5800–5809

  12. Jabbooree AI, Khanli LM, Salehpour P, Pourbahrami S (2023) A novel facial expression recognition algorithm using geometry β-skeleton in fusion based on deep CNN. Image Vis Comput 134:104677

  13. Wang K, Peng X, Yang J, Meng D, Qiao Y (2020) Region attention networks for pose and occlusion robust facial expression recognition. IEEE Trans Image Process 29:4057–4069

  14. Minaee S, Minaei M, Abdolrashidi A (2021) Deep-emotion: facial expression recognition using attentional convolutional network. Sensors 21(9):3046

  15. Yang H, Zhang Z, Yin L (2018) Identity-adaptive facial expression recognition through expression regeneration using conditional generative adversarial networks. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp 294–301. IEEE

  16. Chen J, Konrad J, Ishwar P (2018) VGAN-based image representation learning for privacy-preserving facial expression recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp 1570–1579

  17. Otberdout N, Daoudi M, Kacem A, Ballihi L, Berretti S (2020) Dynamic facial expression generation on Hilbert hypersphere with conditional Wasserstein generative adversarial nets. IEEE Trans Pattern Anal Mach Intell 44:848–863

  18. Cai J, Meng Z, Khan AS, O’Reilly J, Li Z, Han S, Tong Y (2021) Identity-free facial expression recognition using conditional generative adversarial network. In: 2021 IEEE International Conference on Image Processing (ICIP), pp 1344–1348. IEEE

  19. Kahou SE, Bouthillier X, Lamblin P, Gulcehre C, Michalski V, Konda K, Jean S, Froumenty P, Dauphin Y, Boulanger-Lewandowski N et al (2016) Emonets: multimodal deep learning approaches for emotion recognition in video. J Multimodal User Interfaces 10(2):99–111

  20. Xu B, Fu Y, Jiang Y-G, Li B, Sigal L (2016) Video emotion recognition with transferred deep feature encodings. In: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, pp 15–22

  21. Abbasnejad I, Sridharan S, Nguyen D, Denman S, Fookes C, Lucey S (2017) Using synthetic data to improve facial expression analysis with 3d convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp 1609–1618

  22. Al Chanti D, Caplier A (2018) Deep learning for spatio-temporal modeling of dynamic spontaneous emotions. IEEE Trans Affect Comput 12(2):363–376

  23. Zhao J, Mao X, Zhang J (2018) Learning deep facial expression features from image and optical flow sequences using 3D CNN. Vis Comput 34(10):1461–1475

  24. Ayral T, Pedersoli M, Bacon S, Granger E (2021) Temporal stochastic softmax for 3D CNNs: an application in facial expression recognition. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp 3029–3038

  25. Miyoshi R, Akizuki S, Tobitani K, Nagata N, Hashimoto M (2022) Convolutional neural tree for video-based facial expression recognition embedding emotion wheel as inductive bias. In: 2022 IEEE International Conference on Image Processing (ICIP), pp 3261–3265. IEEE

  26. Baddar WJ, Ro YM (2019) Mode variational LSTM robust to unseen modes of variation: application to facial expression recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 33, pp 3215–3223

  27. Miyoshi R, Nagata N, Hashimoto M (2021) Enhanced convolutional LSTM with spatial and temporal skip connections and temporal gates for facial expression recognition from video. Neural Comput Appl 33(13):7381–7392

  28. Liu D, Zhang H, Zhou P (2021) Video-based facial expression recognition using graph convolutional networks. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp 607–614. IEEE

  29. Jeong D, Kim B-G, Dong S-Y (2020) Deep joint spatiotemporal network (DJSTN) for efficient facial expression recognition. Sensors 20(7):1936

  30. Liu C, Hirota K, Ma J, Jia Z, Dai Y (2021) Facial expression recognition using hybrid features of pixel and geometry. IEEE Access 9:18876–18889

  31. Ngoc QT, Lee S, Song BC (2020) Facial landmark-based emotion recognition via directed graph neural network. Electronics 9(5):764

  32. Sun X, Xia P, Ren F (2021) Multi-attention based deep neural network with hybrid features for dynamic sequential facial expression recognition. Neurocomputing 444:378–389

  33. Zhao R, Liu T, Huang Z, Lun DP-K, Lam KK (2021) Geometry-aware facial expression recognition via attentive graph convolutional networks. IEEE Trans Affect Comput

  34. Gan C, Yao J, Ma S, Zhang Z, Zhu L (2022) The deep spatiotemporal network with dual-flow fusion for video-oriented facial expression recognition. Digit Commun Netw

  35. Singh R, Saurav S, Kumar T, Saini R, Vohra A, Singh S (2023) Facial expression recognition in videos using hybrid CNN & ConvLSTM. Int J Inf Technol 15(4):1819–1830

  36. Jiang X, Yu FR, Song T, Leung VC (2021) A survey on multi-access edge computing applied to video streaming: some research issues and challenges. IEEE Commun Surv Tutor 23(2):871–903

  37. Yu Z, Amin SU, Alhussein M, Lv Z (2021) Research on disease prediction based on improved DeepFM and IoMT. IEEE Access 9:39043–39054

  38. Rahman MA, Hossain MS (2021) An internet-of-medical-things-enabled edge computing framework for tackling COVID-19. IEEE Internet Things J 8(21):15847–15854

  39. Chen J, Li K, Deng Q, Li K, Philip SY (2019) Distributed deep learning model for intelligent video surveillance systems with edge computing. IEEE Trans Ind Inform

  40. He W, Wang Y, Zhou M, Wang B (2022) A novel parameters correction and multivariable decision tree method for edge computing enabled HGR system. Neurocomputing 487:203–213

  41. Zhen P, Chen H-B, Cheng Y, Ji Z, Liu B, Yu H (2021) Fast video facial expression recognition by a deeply tensor-compressed LSTM neural network for mobile devices. ACM Trans Internet Things 2(4):1–26

  42. Chen A, Xing H, Wang F (2020) A facial expression recognition method using deep convolutional neural networks based on edge computing. IEEE Access 8:49741–49751

  43. Dabhi MK, Pancholi BK (2016) Face detection system based on Viola-Jones algorithm. Int J Sci Res (IJSR) 5(4):62–64

  44. Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360

  45. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C (2018) MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 4510–4520

  46. Howard A, Sandler M, Chu G, Chen L-C, Chen B, Tan M, Wang W, Zhu Y, Pang R, Vasudevan V et al (2019) Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 1314–1324

  47. Huynh-The T, Hua C-H, Tu NA, Kim D-S (2020) Learning 3D spatiotemporal gait feature by convolutional network for person identification. Neurocomputing 397:192–202

  48. Zhao X, Liang X, Liu L, Li T, Han Y, Vasconcelos N, Yan S (2016) Peak-piloted deep network for facial expression recognition. In: European Conference on Computer Vision, pp 425–442. Springer

  49. Munasinghe M (2018) Facial expression recognition using facial landmarks and random forest classifier. In: 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), pp 423–427. IEEE

  50. Presti LL, La Cascia M (2016) 3D skeleton-based human action classification: a survey. Pattern Recognit 53:130–147

  51. Choi S, Kim J, Kim W, Kim C (2019) Skeleton-based gait recognition via robust frame-level matching. IEEE Trans Inf Forensics Secur 14(10):2577–2592

  52. Lachgar M, Benouda H, Elfirdoussi S (2018) Android REST APIs: Volley vs Retrofit. In: 2018 International Symposium on Advanced Electrical and Communication Technologies (ISAECT), pp 1–6. IEEE

  53. Lucey P, Cohn JF, Kanade T, Saragih J, Ambadar Z, Matthews I (2010) The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp 94–101. IEEE

  54. Pramod (2021) Facial keypoint dataset 150. https://www.kaggle.com/datasets/pramod722445/facial-keypoint-dataset-150

  55. Postman. https://www.postman.com/

  56. Yang J, Qian T, Zhang F, Khan SU (2021) Real-time facial expression recognition based on edge computing. IEEE Access 9:76178–76190

  57. Tran D, Wang H, Torresani L, Ray J, LeCun Y, Paluri M (2018) A closer look at spatiotemporal convolutions for action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 6450–6459

  58. Cheng K, Zhang Y, He X, Cheng J, Lu H (2021) Extremely lightweight skeleton-based action recognition with ShiftGCN++. IEEE Trans Image Process 30:7333–7348

  59. Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531

  60. Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, Charles Z, Cormode G, Cummings R et al (2021) Advances and open problems in federated learning. Found Trends Mach Learn 14(1–2):1–210

  61. He C, Annavaram M, Avestimehr S (2020) Group knowledge transfer: federated learning of large CNNs at the edge. Adv Neural Inf Process Syst 33:14068–14080

  62. Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 10012–10022

  63. Bertasius G, Wang H, Torresani L (2021) Is space-time attention all you need for video understanding? In: ICML, vol 2, p 4

Funding

This work was supported by the Faculty Development Competitive Research Grant Program (Funder Project References: 11022021FD2925 and 021220FD1451) at Nazarbayev University.

Author information

Contributions

NA elaborated on the methodology, conceptualized the technical content, performed the experiments and analysis, and wrote the paper. AZ elaborated on the methodology, conceptualized the technical content, and wrote the paper. TA contributed data or analysis tools and wrote the paper. D-MB performed the experiments and analysis and reviewed the manuscript. NAT elaborated on the methodology, conceptualized the technical content, and wrote and reviewed the paper.

Corresponding author

Correspondence to Nguyen Anh Tu.

Ethics declarations

Conflict of interest

The authors have no competing interests that are relevant to the content of this article.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aikyn, N., Zhanegizov, A., Aidarov, T. et al. Efficient facial expression recognition framework based on edge computing. J Supercomput 80, 1935–1972 (2024). https://doi.org/10.1007/s11227-023-05548-x
