Research Article
DOI: 10.1145/3503161.3548243

A Multi-view Spectral-Spatial-Temporal Masked Autoencoder for Decoding Emotions with Self-supervised Learning

Published: 10 October 2022

ABSTRACT

Affective brain-computer interfaces have advanced to the point where researchers can successfully interpret labeled, artifact-free EEG data collected in laboratory settings. However, annotating EEG data is time-consuming and requires substantial human effort, which limits practical applications. Furthermore, EEG data collected in daily life may be partially corrupted, since EEG signals are sensitive to noise. In this paper, we propose a Multi-view Spectral-Spatial-Temporal Masked Autoencoder (MV-SSTMA) with self-supervised learning to tackle these challenges and move toward everyday applications. MV-SSTMA is built on a multi-view CNN-Transformer hybrid structure that interprets emotion-related information in EEG signals from spectral, spatial, and temporal perspectives. Our model consists of three stages: 1) in the generalized pre-training stage, channels of unlabeled EEG data from all subjects are randomly masked and then reconstructed, so that the model learns generic EEG representations; 2) in the personalized calibration stage, only a few labeled samples from a specific subject are used to calibrate the model; 3) in the personal test stage, the model decodes the subject's emotions from intact EEG data as well as from corrupted data with missing channels. Extensive experiments on two open emotional EEG datasets demonstrate that the proposed model achieves state-of-the-art emotion recognition performance. Moreover, even when channels are missing, the model can still recognize emotions effectively.
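The heart of the pre-training stage is a masked-autoencoding objective over EEG channels: whole channels are hidden and the network is trained to reconstruct them from the remaining ones, which is also what lets the model cope with missing channels at test time. The sketch below illustrates that idea in PyTorch. It is a minimal sketch under stated assumptions: the class name MaskedChannelAutoencoder, the single plain Transformer encoder (rather than the paper's multi-view spectral-spatial-temporal CNN-Transformer hybrid), the per-channel feature shapes, and all hyperparameters are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of channel-masked autoencoding pre-training on EEG features.
# Assumes per-channel feature vectors (e.g. band-level features); names, shapes,
# and the plain Transformer encoder are illustrative, not MV-SSTMA itself.
import torch
import torch.nn as nn


class MaskedChannelAutoencoder(nn.Module):
    """Randomly masks whole EEG channels and reconstructs them (self-supervised)."""

    def __init__(self, n_features=5, d_model=64, mask_ratio=0.5):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.embed = nn.Linear(n_features, d_model)           # per-channel embedding
        self.mask_token = nn.Parameter(torch.zeros(d_model))  # learnable token for masked channels
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.Linear(d_model, n_features)         # reconstruct channel features

    def forward(self, x):
        # x: (batch, n_channels, n_features)
        tokens = self.embed(x)
        mask = torch.rand(x.shape[:2], device=x.device) < self.mask_ratio  # True = masked channel
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(tokens), tokens)
        recon = self.decoder(self.encoder(tokens))
        # Reconstruction loss is computed only on the masked channels
        return ((recon - x) ** 2)[mask].mean()


# Stage-1 style pre-training step on unlabeled EEG from any subject
model = MaskedChannelAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randn(8, 62, 5)  # placeholder stand-in for real unlabeled EEG features
optimizer.zero_grad()
loss = model(batch)
loss.backward()
optimizer.step()
```

After this kind of pre-training, the encoder would be reused and fine-tuned with the few labeled samples of the calibration stage, while the same mask-token mechanism stands in for channels that are absent or corrupted at test time.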


Published in

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022, 7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
Copyright © 2022 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
