research-article

Edge-AI-Driven Framework with Efficient Mobile Network Design for Facial Expression Recognition

Authors:

Shaohua Wan

ACM Transactions on Embedded Computing Systems, Volume 22, Issue 3

Article No.: 57, Pages 1 - 17

https://doi.org/10.1145/3587038

Published: 19 April 2023

Abstract

Facial Expression Recognition (FER) in the wild poses significant challenges due to real-world variations in occlusion, illumination, scale, and head pose in facial images. In this article, we propose an Edge-AI-driven framework for FER. On the algorithm side, we propose two attention modules, Arbitrary-oriented Spatial Pooling (ASP) and Scalable Frequency Pooling (SFP), which extract features more effectively and thereby improve classification accuracy. On the system side, we propose an edge-cloud joint inference architecture for low-latency FER, consisting of a lightweight backbone network running on the edge device and two optional attention modules partially offloaded to the cloud. Performance evaluation demonstrates that our approach achieves a good balance between classification accuracy and inference latency.
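The edge-cloud split described in the abstract can be illustrated with a small latency model. The sketch below is not the authors' implementation: the names `LinkEstimate`, `offload_latency_ms`, and `plan_inference`, the latency-budget rule, and all numbers are assumptions introduced purely for illustration. It only shows one plausible way an edge device might decide whether to invoke optional cloud-side modules after running its local backbone.

```python
from dataclasses import dataclass


@dataclass
class LinkEstimate:
    """Hypothetical snapshot of the edge-to-cloud network link."""
    uplink_mbps: float  # estimated uplink bandwidth in megabits per second
    rtt_ms: float       # estimated round-trip time in milliseconds


def offload_latency_ms(feature_bytes: int, link: LinkEstimate,
                       cloud_compute_ms: float) -> float:
    """Estimate the extra latency of sending features to the cloud.

    Assumed model: round trip + upload time + cloud compute time.
    """
    transfer_ms = feature_bytes * 8 / (link.uplink_mbps * 1e6) * 1e3
    return link.rtt_ms + transfer_ms + cloud_compute_ms


def plan_inference(backbone_ms: float, feature_bytes: int, link: LinkEstimate,
                   cloud_compute_ms: float, budget_ms: float) -> str:
    """Choose between edge-only and edge+cloud inference.

    The backbone always runs on the edge device; the optional cloud-side
    modules are used only if the estimated total stays within the budget.
    """
    total_ms = backbone_ms + offload_latency_ms(
        feature_bytes, link, cloud_compute_ms)
    return "edge+cloud" if total_ms <= budget_ms else "edge-only"


if __name__ == "__main__":
    link = LinkEstimate(uplink_mbps=10.0, rtt_ms=20.0)
    # Illustrative numbers only: 125 kB of features, 30 ms backbone, 5 ms cloud.
    print(plan_inference(30.0, 125_000, link, 5.0, budget_ms=200.0))
    print(plan_inference(30.0, 125_000, link, 5.0, budget_ms=100.0))
```

Under this assumed model, a tighter latency budget or a slower link pushes the decision toward edge-only inference, which mirrors the accuracy-versus-latency trade-off the abstract refers to.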

Index Terms

Edge-AI-Driven Framework with Efficient Mobile Network Design for Facial Expression Recognition
1. Computer systems organization
  1. Real-time systems
2. Computing methodologies
  1. Machine learning
    1. Machine learning algorithms

Information

Published In

ACM Transactions on Embedded Computing Systems Volume 22, Issue 3

May 2023

519 pages

ISSN:1539-9087

EISSN:1558-3465

DOI:10.1145/3592782

Editor:
Tulika Mitra
National University of Singapore, Singapore

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 19 April 2023

Online AM: 06 March 2023

Accepted: 17 February 2023

Revised: 12 December 2022

Received: 06 May 2022

Published in TECS Volume 22, Issue 3

Funding Sources

National Natural Science Foundation of China
Key Project of Shenzhen City Special Fund for Fundamental Research
National Key R&D Program of China
Fundamental Research Funds for the Central Universities
Fundamental Research Funds for the Central Universities, JLU, Joint Foundation of the Ministry of Education
Kempe Foundation, Sweden
