In the Blink of an Eye: Event-based Emotion Recognition

Published: 23 July 2023

Abstract

We introduce a wearable single-eye emotion recognition device and a real-time approach to recognizing emotions from partial facial observations (a single eye area) that is robust to changes in lighting conditions. At the heart of our method is a bio-inspired event-based camera setup and a newly designed lightweight Spiking Eye Emotion Network (SEEN). Compared to conventional cameras, event-based cameras offer a higher dynamic range (up to 140 dB vs. 80 dB) and a higher temporal resolution (on the order of μs vs. tens of milliseconds). Thus, the captured events can encode rich temporal cues under challenging lighting conditions. However, these events lack texture information, which makes it difficult to decode temporal information effectively. SEEN tackles this issue from two different perspectives. First, we adopt convolutional spiking layers to take advantage of the spiking neural network's ability to decode pertinent temporal information. Second, SEEN learns to extract essential spatial cues from corresponding intensity frames and leverages a novel weight-copy scheme to convey spatial attention to the convolutional spiking layers during training and inference. We extensively validate and demonstrate the effectiveness of our approach on a specially collected Single-eye Event-based Emotion (SEE) dataset. To the best of our knowledge, our method is the first eye-based emotion recognition method that leverages event-based cameras and spiking neural networks.
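Event-based cameras output asynchronous per-pixel brightness-change events rather than full frames, which is why they capture temporal cues so densely. As a generic illustration (not the paper's actual input representation — the function name, tuple layout, and frame shape here are assumptions), events are often accumulated over a time window into a signed count frame before being fed to a network:

```python
# Hypothetical event-to-frame accumulation -- a common event-camera
# preprocessing step, sketched with invented names and shapes.
def events_to_frame(events, height, width):
    """Accumulate (x, y, polarity) events into a signed 2D count frame."""
    frame = [[0 for _ in range(width)] for _ in range(height)]
    for x, y, polarity in events:  # polarity: +1 (brighter) or -1 (darker)
        frame[y][x] += polarity
    return frame

# Two positive events at (0, 0) and one negative event at (1, 1):
events = [(0, 0, +1), (0, 0, +1), (1, 1, -1)]
print(events_to_frame(events, height=2, width=2))  # [[2, 0], [0, -1]]
```

Note how the result keeps per-pixel motion polarity but, as the abstract observes, discards absolute intensity and texture.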

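The temporal decoding ability that the abstract attributes to convolutional spiking layers comes from the stateful neuron model underneath. A minimal sketch of a discrete leaky integrate-and-fire (LIF) update — illustrative only; the decay and threshold constants are assumptions, not values from SEEN:

```python
# Minimal leaky integrate-and-fire (LIF) neuron -- a sketch of spiking
# dynamics, not the paper's implementation.
def lif_step(v, x, decay=0.5, v_threshold=1.0):
    """One discrete LIF update: leak, integrate input x, spike, reset."""
    v = decay * v + x                         # leaky integration
    spike = 1.0 if v >= v_threshold else 0.0  # fire when threshold crossed
    if spike:
        v = 0.0                               # hard reset after firing
    return v, spike

def run(inputs):
    """Drive a single neuron with a sequence of input currents."""
    v, spikes = 0.0, []
    for x in inputs:
        v, s = lif_step(v, x)
        spikes.append(s)
    return spikes

# A sustained input bursts into a spike only after the membrane
# potential accumulates across steps -- i.e., the neuron's output
# depends on input history, not just the current frame.
print(run([0.6, 0.6, 0.6, 0.0, 0.0]))  # [0.0, 0.0, 1.0, 0.0, 0.0]
```

Because the membrane potential carries state between time steps, stacks of such neurons can encode temporal structure that a stateless convolution cannot.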
Supplemental Material

  • MP4 File: presentation
  • PDF File: Supplementary Material
  • PDF File: supp



Published In

SIGGRAPH '23: ACM SIGGRAPH 2023 Conference Proceedings
July 2023
911 pages
ISBN:9798400701597
DOI:10.1145/3588432

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Event-based cameras
  2. eye-based emotion recognition

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGGRAPH '23
Acceptance Rates

Overall Acceptance Rate 1,822 of 8,601 submissions, 21%

Article Metrics

  • Downloads (Last 12 months): 332
  • Downloads (Last 6 weeks): 29
Reflects downloads up to 15 Feb 2025

Cited By

  • (2025) Spiking Neural Networks With Adaptive Membrane Time Constant for Event-Based Tracking. IEEE Transactions on Image Processing 34, 1009–1021. https://doi.org/10.1109/TIP.2025.3533213
  • (2024) Apprenticeship-inspired elegance. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 3160–3168. https://doi.org/10.24963/ijcai.2024/350
  • (2024) EyeTrAES: Fine-grained, Low-Latency Eye Tracking via Adaptive Event Slicing. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 4, 1–32. https://doi.org/10.1145/3699745
  • (2024) Dynamic Neural Fields Accelerator Design for a Millimeter-Scale Tracking System. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 32, 10, 1940–1944. https://doi.org/10.1109/TVLSI.2024.3416725
  • (2024) Hierarchical Event-RGB Interaction Network for Single-eye Expression Recognition. Information Sciences, 121539. https://doi.org/10.1016/j.ins.2024.121539
  • (2023) Event-Enhanced Multi-Modal Spiking Neural Network for Dynamic Obstacle Avoidance. Proceedings of the 31st ACM International Conference on Multimedia, 3138–3148. https://doi.org/10.1145/3581783.3612147
  • (2023) Frame-Event Alignment and Fusion Network for High Frame Rate Tracking. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 9781–9790. https://doi.org/10.1109/CVPR52729.2023.00943
  • (2023) A Universal Event-Based Plug-In Module for Visual Object Tracking in Degraded Conditions. International Journal of Computer Vision 132, 5, 1857–1879. https://doi.org/10.1007/s11263-023-01959-8
  • (2023) Spiking Reinforcement Learning for Weakly-Supervised Anomaly Detection. Neural Information Processing, 175–187. https://doi.org/10.1007/978-981-99-8073-4_14
