research-article
DOI: 10.1145/3512527.3531385

Mobile Emotion Recognition via Multiple Physiological Signals using Convolution-augmented Transformer

Published: 27 June 2022

ABSTRACT

Recognising and monitoring emotional states play a crucial role in mental health and well-being management. Importantly, with the widespread adoption of smart mobile and wearable devices, it has become easier to collect long-term, granular, emotion-related physiological data passively, continuously, and remotely. This creates new opportunities to help individuals manage their emotions and well-being less intrusively using off-the-shelf, low-cost devices. Pervasive emotion recognition based on physiological signals remains challenging, however, because of the difficulty of efficiently extracting high-order correlations between physiological signals and users' emotional states. In this paper, we propose a novel end-to-end emotion recognition system based on a convolution-augmented transformer architecture. Specifically, it recognises users' emotions on the dimensions of arousal and valence by learning both the global and the local fine-grained associations and dependencies within and across multimodal physiological data (including blood volume pulse, electrodermal activity, heart rate, and skin temperature). We extensively evaluated the performance of our model on the K-EmoCon dataset, which was acquired with off-the-shelf devices during naturalistic conversations and contains spontaneous emotion data. Our results demonstrate that our approach outperforms the baselines and achieves state-of-the-art or competitive performance. We also demonstrate the effectiveness and generalisability of our system on another affective dataset that used affect inducement and commercial physiological sensors.
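This page does not include implementation details. As a rough illustration of the core idea, the following PyTorch sketch pairs multi-head self-attention (global dependencies within and across signals) with a gated depthwise-convolution module (local fine-grained patterns) in a Conformer-style block. All layer sizes, the early channel-wise fusion of the four signals, the temporal average pooling, and the separate binary arousal/valence heads are illustrative assumptions, not the authors' design.

# Minimal sketch of a convolution-augmented transformer (Conformer-style)
# encoder for multimodal physiological signals. NOT the authors'
# implementation: layer sizes, early channel-wise fusion, pooling, and the
# binary arousal/valence heads are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConformerBlock(nn.Module):
    """Self-attention captures global dependencies across the window;
    a gated depthwise convolution captures local fine-grained patterns."""

    def __init__(self, d_model=64, n_heads=4, kernel=7, ff_mult=4, p=0.1):
        super().__init__()
        self.ff1 = nn.Sequential(                       # half-step FFN
            nn.LayerNorm(d_model),
            nn.Linear(d_model, ff_mult * d_model), nn.SiLU(), nn.Dropout(p),
            nn.Linear(ff_mult * d_model, d_model))
        self.attn_norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=p,
                                          batch_first=True)
        self.conv_norm = nn.LayerNorm(d_model)
        self.pw1 = nn.Conv1d(d_model, 2 * d_model, 1)   # pointwise, feeds GLU
        self.dw = nn.Conv1d(d_model, d_model, kernel,
                            padding=kernel // 2, groups=d_model)  # depthwise
        self.bn = nn.BatchNorm1d(d_model)
        self.pw2 = nn.Conv1d(d_model, d_model, 1)
        self.ff2 = nn.Sequential(                       # half-step FFN
            nn.LayerNorm(d_model),
            nn.Linear(d_model, ff_mult * d_model), nn.SiLU(), nn.Dropout(p),
            nn.Linear(ff_mult * d_model, d_model))
        self.out_norm = nn.LayerNorm(d_model)

    def forward(self, x):                               # x: (batch, time, d_model)
        x = x + 0.5 * self.ff1(x)
        a = self.attn_norm(x)
        x = x + self.attn(a, a, a, need_weights=False)[0]   # global context
        c = self.conv_norm(x).transpose(1, 2)           # (batch, d_model, time)
        c = F.glu(self.pw1(c), dim=1)
        c = self.pw2(F.silu(self.bn(self.dw(c))))
        x = x + c.transpose(1, 2)                       # local context
        x = x + 0.5 * self.ff2(x)
        return self.out_norm(x)


class EmotionConformer(nn.Module):
    """Maps four signal channels (e.g. BVP, EDA, HR, skin temperature,
    resampled to a shared rate) to binary arousal and valence logits."""

    def __init__(self, n_channels=4, d_model=64, n_blocks=2):
        super().__init__()
        self.embed = nn.Linear(n_channels, d_model)     # early fusion
        self.blocks = nn.Sequential(
            *[ConformerBlock(d_model) for _ in range(n_blocks)])
        self.arousal_head = nn.Linear(d_model, 2)
        self.valence_head = nn.Linear(d_model, 2)

    def forward(self, x):                               # x: (batch, time, channels)
        h = self.blocks(self.embed(x)).mean(dim=1)      # average over time
        return self.arousal_head(h), self.valence_head(h)


# Example: a batch of 8 five-second windows sampled at 64 Hz.
arousal_logits, valence_logits = EmotionConformer()(torch.randn(8, 320, 4))

Fusing the channels with a shared linear embedding before the encoder is only one plausible way to let attention model cross-signal dependencies; per-signal encoders with late fusion would be an equally reasonable reading of the abstract.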


Supplemental Material

ICMR22-icmrfp159.mp4 (mp4, 33 MB)


Published in

ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval
June 2022, 714 pages
ISBN: 978-1-4503-9238-9
DOI: 10.1145/3512527
Copyright © 2022 ACM
Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 254 of 830 submissions (31%)

