DRS-Net: A spatial–temporal affective computing model based on multichannel EEG data
Introduction
Affective computing is a relatively new academic discipline that studies and develops systems and devices for recognizing, interpreting, processing, and simulating human affect. It is an interdisciplinary field spanning computer science, psychology, and cognitive science, proposed by Rosalind Picard in 1997 [1]. Current affective computing models usually rely on human behavioral data, including facial expressions, body movements, and speech. Although such data are easy to obtain, they have low reliability and credibility. In particular, facial expressions may be involuntary or feigned and therefore may not reflect genuine emotions. Neuroimaging can capture human emotional characteristics more accurately, but data collection is difficult, expensive, and inconvenient. In recent years, electroencephalography (EEG) has gained attention among physiological recording techniques because EEG data are objective, natural, non-invasive, and practical to collect, and can reflect a person’s affective state directly and accurately.
A general affective computing framework based on EEG usually consists of four key steps: EEG data acquisition, data processing, feature extraction, and classification. Discriminative feature extraction is still a challenge for EEG-based affective computing. EEG data are discrete time series whose spatial, spectral, and temporal characteristics are reliably related to cognitive activities [2]. How effectively EEG features are extracted and selected is a key factor influencing recognition results [3]. However, the most widely used feature extraction approaches for EEG data are hand-designed based on prior information or assumptions, and EEG features are usually divided into time-domain and frequency-domain features. Time-domain features, such as statistical information, the Hjorth feature [4], and the fractal dimension feature [5], primarily capture the temporal information of EEG signals. Frequency-domain features, such as the power spectral density (PSD) feature, the differential entropy (DE) feature [6], and the rational asymmetry feature [7], attempt to extract emotion information from the frequency domain. Advanced signal processing approaches have also been applied to EEG feature extraction, such as wavelet transforms [8], empirical mode decomposition (EMD) [9], the multivariate empirical wavelet transform (MEWT) [10], and multivariate iterative filtering (MIF) [11]. However, EEG data are multichannel data with rich nonlinear dynamics and high-dimensional properties [12]. Most of the above approaches are based on linear models and do not reflect the complex nonlinear dynamics of human brain activity. In recent studies, nonlinear dynamics and chaos theory have been introduced to EEG data processing. Nonlinear features, such as the maximum Lyapunov exponent [13] and sample entropy [4], are employed in EEG signal studies. Yang et al. analyzed 10 types of EEG features for emotion recognition [4], including both linear and nonlinear features, such as the standard deviation, the PSD of four bands, and sample entropy. They calculated the features of each channel independently and integrated them into a matrix as the final EEG feature representation. In practice, simply combining per-channel features into a matrix does not consider the correlation among channels; it neglects the interplay of data from different electrodes and causes information loss.
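To make the channel-independent, hand-designed approach concrete, the following minimal sketch computes two of the features named above, differential entropy and the Hjorth parameters, for each channel and stacks them into a feature matrix. It is an illustration under stated assumptions and not the procedure of any cited work: the function names, the 62-channel random input, and the Gaussian assumption behind the DE formula are all hypothetical choices, and in practice DE is usually computed on band-filtered signals.

```python
import numpy as np

def differential_entropy(x):
    """DE of a 1-D signal under a Gaussian assumption: 0.5 * ln(2*pi*e*var)."""
    return 0.5 * np.log(2.0 * np.pi * np.e * np.var(x))

def hjorth_parameters(x):
    """Return the Hjorth activity, mobility, and complexity of a 1-D signal."""
    dx = np.diff(x)    # first difference
    ddx = np.diff(dx)  # second difference
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def channelwise_features(eeg):
    """Map eeg of shape (n_channels, n_samples) to a (n_channels, 4) matrix.

    Each row is computed from one channel alone, so no cross-channel
    (spatial) correlation enters the representation.
    """
    rows = []
    for channel in eeg:
        rows.append([differential_entropy(channel), *hjorth_parameters(channel)])
    return np.asarray(rows)

# Hypothetical input: 62 channels, 1000 samples of synthetic noise.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((62, 1000))
features = channelwise_features(eeg)
```

Because every row depends on a single channel, permuting or decorrelating the channels leaves each row unchanged, which is the information loss the paragraph above describes.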
Another critical factor for EEG-based affective computing is how to devise a reliable emotion classifier. Many researchers have resorted to machine learning techniques, such as the support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), and random forest (RF), to identify the emotional states of a subject. Although these classification methods perform well for image classification, object recognition, and speech classification, they show some disadvantages for EEG-based emotion recognition, because EEG data are more complex discrete time series with abundant spatial, temporal, and nonlinear characteristics. How to effectively use these characteristics in an EEG emotion classifier is still a challenging problem.
Recently, deep learning has developed rapidly and has been successfully applied to emotion detection with EEG data [14], [15], [16], [17], [18], [19]. Zheng et al. constructed an EEG-based emotion recognition model by combining deep belief networks (DBNs) and DE features, achieving an accuracy of 86.08% on the three-class SEED dataset [14]. Alhagry et al. proposed a long short-term memory (LSTM) recurrent neural network (RNN) model to learn temporal features from EEG data and classify them with a dense layer [15]. Zhang et al. proposed a two-layer RNN model to detect emotional states [16]. Inspired by neuroscience findings on brain asymmetry, a bi-hemisphere domain adversarial neural network (BiDANN) model was proposed for emotion recognition [17]. Zheng et al. presented a multimodal framework named EmotionMeter for emotion detection based on deep neural networks [18]. Although these methods achieve good performance on EEG-based emotion classification, they either ignore the dynamic temporal information of EEG or need a long time to compute EEG features because of the high dimensionality of EEG and the huge number of parameters trained by gradient descent.
As an efficient implementation of reservoir computing (RC), the echo state network (ESN) was pioneered by Jaeger in 2001 [20]. The ESN is a type of recurrent neural network. It requires no gradient descent training and offers high-dimensional nonlinear mapping capabilities together with short-term memory. Thus, it has been effectively utilized in time series processing. Recently, the ESN has been used to decode EEG data for direction in brain-computer interfaces, because EEG data are multichannel, high-dimensional, nonlinear time series [21]. Sun et al. [22] employed the trained output weights of the ESN to encode EEG features automatically and obtained good classification results for neurological disease diagnosis. Furthermore, the ESN and its variants have been utilized to detect emotional states with excellent results, owing to the reservoir’s powerful nonlinear temporal mapping [23], [24]. In these studies, the reservoir, that is, the hidden layer of the ESN with fixed weights, is an efficient component for obtaining EEG features.
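For reference, the textbook form of a basic (non-leaky) ESN can be written as follows. This is the standard formulation and is given here only as a sketch; the symbols $u(t)$, $x(t)$, $y(t)$, $W_{\mathrm{in}}$, $W$, $W_{\mathrm{out}}$, and $\lambda$ are notation assumed for this illustration, and the exact variant used in the cited works may differ.

```latex
% Reservoir state update: W_in and W are random and stay fixed (untrained).
x(t) = \tanh\bigl( W_{\mathrm{in}}\, u(t) + W\, x(t-1) \bigr)

% Linear readout: only W_out is learned.
y(t) = W_{\mathrm{out}}\, x(t)

% Closed-form ridge regression for the readout, where the columns of X are
% the collected states x(t), the columns of Y are the targets, and
% \lambda > 0 is the regularization strength.
W_{\mathrm{out}} = Y X^{\top} \bigl( X X^{\top} + \lambda I \bigr)^{-1}
```

Since only the linear readout is fitted, and in closed form, no gradient has to be propagated through the recurrent weights, which is why the ESN avoids gradient descent training.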
Inspired by the advantages of RC and the LSTM network, we propose a novel multichannel EEG-driven affective computing model for emotion classification, namely the Dynamic Reservoir State Network (DRS-Net). DRS-Net is an end-to-end encoder-decoder model that uses the constructed dynamic reservoir state as an encoder and an LSTM-dense model as a decoder. In contrast to existing studies, in the encoding process the proposed dynamic reservoir state encoder projects the raw multichannel EEG data into the high-dimensional reservoir layer and collects the reservoir states at all time steps as a new multidimensional time series. Then, a one-step ridge regression linear prediction is performed, and the learned prediction weights are stored in a dynamic reservoir feature matrix that serves as the EEG feature representation. Consequently, the dynamic reservoir feature matrix reflects the nonlinear dynamic EEG features, taking both the temporal information and the spatial correlation of electrodes into consideration. In the decoding process, we flatten the feature matrix into a one-dimensional sequence and feed it into the LSTM layer to extract the information from the encoder. A dropout layer is introduced between the LSTM layer and the dense layer to reduce overfitting. Then, a dense layer followed by a softmax is used to classify the subject’s emotional states. Experimental results demonstrate that the proposed DRS-Net model is an efficient end-to-end spatial–temporal affective computing model based on multichannel EEG data. The primary contributions can be summarized as follows:
- (1) We propose an end-to-end affective computing framework, namely DRS-Net, which consists of a dynamic reservoir state encoder and an LSTM-dense decoder.
- (2) We design a dynamic reservoir state encoder model to automatically extract the spatial–temporal features of multichannel signals. Results show that the encoder is effective in representing dynamic nonlinear EEG signal features with high speed and low complexity.
- (3) We present a possibility for integrating RC with the deep neural network paradigm to deal with nonlinear time series, which combines the efficiency of the reservoir layer of the RC model in representing time series and the efficacy of LSTM in modeling sequence data in a unified framework.
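The encoding steps described above (projecting the EEG into a reservoir, collecting the states at all time steps, fitting a one-step ridge regression, and flattening the learned weights) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the reservoir size, spectral radius, input scaling, and ridge strength are hypothetical, and the choice of predicting the next reservoir state from the current one is our reading of "one-step ridge regression linear prediction".

```python
import numpy as np

def make_reservoir(n_inputs, n_units, spectral_radius=0.9, input_scale=0.1, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-input_scale, input_scale, size=(n_units, n_inputs))
    w = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def reservoir_states(eeg, w_in, w):
    """Drive the reservoir with eeg (n_samples, n_channels); return all states."""
    states = np.zeros((eeg.shape[0], w.shape[0]))
    x = np.zeros(w.shape[0])
    for t, u in enumerate(eeg):
        x = np.tanh(w_in @ u + w @ x)
        states[t] = x
    return states

def one_step_ridge_weights(states, ridge=1.0):
    """Fit states[t+1] ~ [states[t], 1] @ W by closed-form ridge regression.

    Assumption: the prediction target is the next reservoir state. For
    simplicity the bias row is regularized together with the other weights.
    """
    past, future = states[:-1], states[1:]
    past1 = np.hstack([past, np.ones((past.shape[0], 1))])  # append bias column
    gram = past1.T @ past1 + ridge * np.eye(past1.shape[1])
    return np.linalg.solve(gram, past1.T @ future)  # shape (n_units + 1, n_units)

# Hypothetical trial: 200 time steps of 62-channel synthetic data.
rng = np.random.default_rng(1)
trial = rng.standard_normal((200, 62))
w_in, w = make_reservoir(n_inputs=62, n_units=30)
states = reservoir_states(trial, w_in, w)
feature_matrix = one_step_ridge_weights(states)  # per-trial feature matrix
feature_sequence = feature_matrix.ravel()        # flattened input for a decoder
```

In this sketch every reservoir unit receives input from all channels, so the fitted weights depend on both the temporal evolution and the cross-channel structure of the trial. The flattened `feature_sequence` would then be passed to a sequence model such as the LSTM-dense decoder described above, which is omitted here.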
The rest of this paper is organized as follows. Section 2 introduces the EEG datasets used. Section 3 presents the proposed DRS-Net model in detail. Section 4 presents the experiments and discusses the results on three datasets. Section 5 concludes the paper.
Section snippets
EEG data preparation
Three commonly used public datasets were employed for analysis: SEED, SEED-IV, and DEAP. These benchmark datasets are briefly introduced below.
Framework of the proposed DRS-Net
In this section, we describe in detail the construction of the proposed DRS-Net model. Unlike traditional classifiers based on hand-designed feature selection, we propose a multichannel EEG-driven spatial–temporal affective computing framework that is an end-to-end model. The framework of the proposed method is shown in Fig. 3.
The proposed method consists of a dynamic reservoir state encoder and an LSTM-dense decoder. In the encoding stage, pre-processed multichannel EEG data are projected
Experiments and results analysis
To evaluate the effectiveness of the proposed DRS-Net affective computing model, we have conducted experiments on three public EEG datasets, SEED, SEED-IV, and DEAP. Original hand-selected EEG features, such as DE, FD, PSD, and STA (including six statistics features: mean, standard deviation, mean of absolute values of the first differences, mean of absolute values of the first differences of normalized EEG, mean of absolute values of the second differences, and the mean of the absolute values
Conclusions
This study proposes a physiological data-driven affective computing architecture based on multichannel EEG data, namely DRS-Net. The proposed end-to-end model is inspired by the good performance of RC. Thus, a dynamic reservoir state encoder model is constructed to extract spatial–temporal information of multichannel EEG data. Given its high-dimensional nonlinear mapping capabilities and short-term memory, the dynamic reservoir state encoder model can obtain dynamic high dimension nonlinear
Data availability statement
Publicly available datasets were analyzed in this study. These datasets can be found at: DEAP dataset https://www.eecs.qmul.ac.uk/mmv/datasets/deap/index.html; SEED dataset and SEED-IV dataset https://bcmi.sjtu.edu.cn/home/seed/index.html.
Funding
This work is partially supported by the National Natural Science Foundation of China (No. 11772178, No. 11872036, and No. 61907028), the Fundamental Research Fund for the Central Universities (No. 2017CBY008 and No. GK202101004), and the Shaanxi Key Science and Technology Innovation Team Project (No. 2022TD-26).
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (49)
Where does EEG come from and what does it mean? Trends Neurosci. (2017)
Schizophrenia detection technique using multivariate iterative filtering and multichannel EEG signals. Biomed. Signal Process. Control (2021)
Recurrent neural networks employing Lyapunov exponents for EEG signals classification. Expert Syst. Appl. (2005)
Reservoir computing approaches to recurrent neural network training. Comput. Sci. Rev. (2009)
Decoding electroencephalographic signals for direction in brain-computer interface using echo state network and Gaussian readouts. Comput. Biol. Med. (2019)
Unsupervised EEG feature extraction based on echo state network. Inf. Sci. (2019)
Time series classification with echo memory networks. Neural Networks (2019)
Individual differences in emotion processing. Curr. Opin. Neurobiol. (2004)
Leveraging spatial-temporal convolutional features for EEG-based emotion recognition. Biomed. Signal Process. Control (2021)
Affective states classification using EEG and semi-supervised deep learning approaches
Affective Computing
Feature extraction and selection for emotion recognition from EEG. IEEE Trans. Affect. Comput.
Multi-method fusion of cross-subject emotion recognition based on high-dimensional EEG features. Front. Comput. Neurosci.
Real-time fractal-based valence level recognition from EEG
Differential entropy feature for EEG-based emotion classification
EEG-based emotion recognition in music listening. IEEE Trans. Biomed. Eng.
Identification of time-varying systems using multi-wavelet basis functions. IEEE Trans. Control Syst. Technol.
Electroencephalogram emotion recognition based on empirical mode decomposition and optimal feature selection. IEEE Trans. Cogn. Dev. Syst.
A multivariate approach for patient-specific EEG seizure detection using empirical wavelet transform. IEEE Trans. Biomed. Eng.
Subject-independent emotion recognition of EEG signals based on dynamic empirical convolutional neural network. IEEE/ACM Trans. Comput. Biol. Bioinf.
Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Trans. Auton. Ment. Dev.
Emotion recognition based on EEG using LSTM recurrent neural network. Emotion
Spatial-temporal recurrent neural network for emotion recognition. IEEE Trans. Cybern.
A bi-hemisphere domain adversarial neural network model for EEG emotion recognition. IEEE Trans. Affect. Comput.