Classification of multiple motor imagery using deep convolutional neural networks and spatial filters
Introduction
Research into new human–computer interaction systems is ongoing, and brain–computer interfaces (BCIs) are among the systems under development. These systems allow users to communicate or manipulate external devices using only electrophysiological measures of their brain activity [1]. This electrical activity is caused by the flow of electric currents during synaptic excitation of the dendrites in the neurons. Non-invasive recording procedures are commonly used since they are safer for the subjects, more convenient to implement, and usable by most of the population, including people with physical disabilities [2], [3], [4]. The signals acquired by these procedures are known as electroencephalographic (EEG) signals. Several brain activity patterns that reveal the user's intentions can be detected in EEG signals; one of them is the pattern generated by motor imagery. Motor imagery (MI) occurs when the user imagines performing a physical activity, such as opening and closing a hand or moving a foot. During motor imagery, the energy of the Mu (8–12 Hz) and Beta (16–24 Hz) rhythms is suppressed, a phenomenon known as event-related desynchronization (ERD). At the end of the motor imagery, the energy of these rhythms is enhanced; this event is called event-related synchronization (ERS).
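The band-power suppression described above is often quantified as a relative power change between a reference interval and a motor imagery interval. The following is a minimal sketch in Python with NumPy. The formula ERD% = (A - R) / R × 100, the function names, the periodogram estimate, the 250 Hz sampling rate, and the synthetic 10 Hz signal are all assumptions made for illustration and are not taken from this paper.

```python
import numpy as np

def band_power(x, fs, f_lo, f_hi):
    """Power of x inside [f_lo, f_hi] Hz, summed from a simple periodogram."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[mask].sum()

def erd_percent(reference, activity, fs, band):
    """Relative band-power change in percent; negative values mean suppression."""
    r = band_power(reference, fs, *band)
    a = band_power(activity, fs, *band)
    return 100.0 * (a - r) / r

fs = 250                                     # assumed sampling rate in Hz
t = np.arange(2 * fs) / fs                   # 2 s of samples
rest = 2.0 * np.sin(2 * np.pi * 10 * t)      # synthetic Mu rhythm at rest
imagery = 1.0 * np.sin(2 * np.pi * 10 * t)   # same rhythm, amplitude halved
erd = erd_percent(rest, imagery, fs, (8, 12))
print(f"Mu-band ERD: {erd:.1f}%")
```

In this synthetic case the amplitude is halved, so the Mu-band power falls to one quarter of its reference value and the sketch reports an ERD of -75%. Real EEG would need band-pass filtering and averaging over trials before such a figure is meaningful.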
Several methods have been proposed for multiple motor imagery classification. Among the most successful is the generation of spatial filters with the Common Spatial Pattern (CSP) algorithm. These filters extract discriminative spatial patterns that contrast the energy features of different motor imageries [5]. Applying the spatial filters projects the EEG signals onto a subspace where the two motor imageries involved are more easily discriminated [5]. Variants of this algorithm have emerged with the aim of increasing classification performance, such as Filter Bank Common Spatial Pattern (FBCSP) [6] and Discriminative Filter Bank Common Spatial Pattern (DFBCSP) [7]. These variants have shown that applying CSP in different frequency bands improves classification accuracy, since they can learn subject-specific patterns from the high-dimensional EEG measurements. The most effective variant has been DFBCSP, since it identifies the most discriminative frequency bands for each motor imagery and subject, which makes it a more robust approach for classifying non-stationary motor imagery EEG signals; in FBCSP, by contrast, the frequency bands are fixed for all subjects. These feature extraction methods are typically used in conventional classification methodologies along with algorithms such as Linear Discriminant Analysis (LDA) [8], artificial neural networks (ANN) [9], or fuzzy systems [10]. However, these conventional methods, like others [11], [12], yield systems with average accuracies below 80%, and most of them solve only binary classification problems. Therefore, the development of methodologies with better performance is still ongoing.
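The two-class CSP projection described above can be sketched in a few lines of NumPy. This is a textbook-style formulation (whitening of the composite covariance followed by an eigendecomposition), not the exact procedure of this paper or of refs. [5], [6], [7]. The function names, the trace normalization, the log-variance features, and the synthetic three-channel data are assumptions for illustration.

```python
import numpy as np

def mean_normalized_cov(trials):
    """Average trace-normalized spatial covariance over trials.

    trials has shape (n_trials, n_channels, n_samples); signals are
    assumed to be band-pass filtered and therefore roughly zero-mean.
    """
    covs = []
    for x in trials:
        c = x @ x.T
        covs.append(c / np.trace(c))
    return np.mean(covs, axis=0)

def csp_filters(trials_a, trials_b, n_pairs=1):
    """Return 2 * n_pairs spatial filters as rows of a matrix."""
    cov_a = mean_normalized_cov(trials_a)
    cov_b = mean_normalized_cov(trials_b)
    # Whiten the composite covariance so that cov_a + cov_b maps to identity.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitening = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # Eigenvalues (ascending) give the share of variance belonging to class A.
    lam, u = np.linalg.eigh(whitening @ cov_a @ whitening.T)
    w = u.T @ whitening
    # Keep the filters at both ends of the spectrum: the most discriminative.
    keep = np.r_[np.arange(n_pairs), np.arange(len(lam) - n_pairs, len(lam))]
    return w[keep]

def log_variance_features(trial, filters):
    """Normalized log-variance of the spatially filtered trial."""
    var = (filters @ trial).var(axis=1)
    return np.log(var / var.sum())

# Synthetic data: class A is strong on channel 0, class B on channel 1.
rng = np.random.default_rng(0)
scale_a = np.array([3.0, 1.0, 1.0])[:, None]
scale_b = np.array([1.0, 3.0, 1.0])[:, None]
trials_a = rng.standard_normal((20, 3, 200)) * scale_a
trials_b = rng.standard_normal((20, 3, 200)) * scale_b

filters = csp_filters(trials_a, trials_b)
feat_a = log_variance_features(trials_a[0], filters)
feat_b = log_variance_features(trials_b[0], filters)
print(feat_a, feat_b)
```

The first returned filter minimizes the variance of class A relative to class B and the last one maximizes it, so the two features order in opposite ways for the two classes. A filter-bank variant would repeat this step once per frequency band.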
Existing feature extraction methods for MI-BCI focus mainly on static energy features and neglect the dynamics of the signal during the execution of motor imagery: when static energy features are extracted, the energy dynamics are reduced to a single number and the temporal information is ignored. Pfurtscheller et al. [13] showed that foot and tongue motor imageries produce a more tenuous drop in energy than left- or right-hand motor imagery; they are more likely to produce an energy pattern in certain channels and frequencies. Therefore, it is necessary to develop a methodology that extracts temporal information and combines it with static energy features to build a classifier that can handle a wider variety of motor imageries. One possible and relatively new approach can be found in deep learning, a branch of machine learning. These algorithms have the potential to detect features, across a wider range of data, that had previously been overlooked. One algorithm that has reached promising results in other areas is the Convolutional Neural Network (CNN), originally proposed by LeCun et al. [14]. Consequently, this paper evaluates the performance of CNNs in detecting dynamic energy-based features. Convolution detects patterns based on a learned kernel and can be used to detect events in signal processing. CNNs have previously been used to classify EEG signals and have achieved better performance than shallow classification algorithms. Among the works that classify different motor imageries is that of Uktveris et al. [15], who classified four-class motor imagery using diverse feature extraction methods and achieved an average accuracy of 68%. Another work on four-class motor imagery classification is that of Yang et al. [9], whose system extracted features using an augmented CSP algorithm and attained an average accuracy of 68.45%.
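The contrast drawn above between static energy features and energy dynamics can be illustrated with a short NumPy sketch. The function names, the window length, and the synthetic signals are assumptions for illustration only and do not describe the feature extraction used in this paper.

```python
import numpy as np

def static_log_energy(signal):
    """One number per trial: log-variance over the whole signal."""
    return np.log(np.var(signal))

def windowed_log_energy(signal, win, step):
    """Log-variance in consecutive windows, preserving the time course."""
    starts = range(0, len(signal) - win + 1, step)
    return np.array([np.log(np.var(signal[s:s + win])) for s in starts])

# Two synthetic trials with the same samples in opposite temporal order:
# one loses energy over time, the other gains it.
rng = np.random.default_rng(0)
envelope = np.linspace(2.0, 0.5, 1000)
falling = envelope * rng.standard_normal(1000)
rising = falling[::-1]

static_falling = static_log_energy(falling)
static_rising = static_log_energy(rising)
dynamic_falling = windowed_log_energy(falling, win=250, step=250)
dynamic_rising = windowed_log_energy(rising, win=250, step=250)
print(static_falling, static_rising)
print(dynamic_falling, dynamic_rising)
```

The static feature is the same for both trials, while the windowed features separate them by the direction of the energy change. This kind of temporal information is what a convolutional kernel sliding along the time axis can pick up.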
For binary classification, Tang et al. [16] achieved an average accuracy of 86.41% using a five-layer CNN to classify hand motor imageries. Despite all these works, more research is still needed to develop multi-class BCIs with better performance and with online capabilities for real-world applications. Therefore, this paper presents two new methods that apply CNNs to classify multiple motor imageries and an idle state (a state without any motor imagery). To classify these mental states, both methods employ features extracted using CSP in different frequency bands, following an improved DFBCSP method presented in this paper, in which the bandpass selection procedure determines the best frequency bands for the mental states to be discriminated. A monolithic method using a single CNN and a modular method using four expert CNNs and a fully connected network are implemented, both for the classification of multiple classes (MI of the left hand, right hand, left foot, right foot, and tongue, plus an idle state), with the aim of comparing the performance of the two methods.
Because information on how to design an optimal CNN structure is scarce, some structure and training hyperparameters were selected by Bayesian optimization. Snoek et al. [17] demonstrated that Bayesian optimization surpasses a human expert at selecting hyperparameters and, as a result, beat existing hyperparameter selection procedures by over 3%. The optimized hyperparameters are the number of convolutional layers, the initial learning rate, the stochastic gradient descent momentum, and the L2 regularization strength. Bayesian optimization [18] typically assumes that the unknown function was sampled from a Gaussian process (GP), and the posterior distribution of this function is updated after each observation [17]. A GP is a probability distribution over random functions; it can involve an infinite set of random variables, any finite subset of which has a multivariate Gaussian distribution [19]. This work uses the observations obtained with different hyperparameter configurations as a general performance measure. The hyperparameters for the next experiment are selected by minimizing the classification error on the validation set.
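The loop described above (fit a GP to the observations, then choose the next configuration) can be sketched for a single hyperparameter as follows. This is a simplified illustration under stated assumptions, not the optimizer used in this paper: the RBF kernel, the expected-improvement acquisition function, the search range, and the quadratic `validation_error` function (a hypothetical stand-in for training a CNN and measuring its validation error) are all chosen here for brevity.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel with unit prior variance (an assumption)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a GP at the points x_new."""
    y_mean = y_obs.mean()
    k = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_s = rbf_kernel(x_obs, x_new)
    mean = y_mean + k_s.T @ np.linalg.solve(k, y_obs - y_mean)
    var = 1.0 - np.sum(k_s * np.linalg.solve(k, k_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed error (minimization)."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def validation_error(log_lr):
    """Hypothetical objective: pretend the error is lowest at log10(lr) = -3."""
    return (log_lr + 3.0) ** 2 / 10.0 + 0.1

candidates = np.linspace(-5.0, -1.0, 81)   # log10 of the initial learning rate
x_obs = np.array([-5.0, -1.0])             # two initial experiments
y_obs = validation_error(x_obs)

for _ in range(8):
    mean, std = gp_posterior(x_obs, y_obs, candidates)
    ei = expected_improvement(mean, std, y_obs.min())
    x_next = candidates[np.argmax(ei)]     # most promising next experiment
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, validation_error(x_next))

best_log_lr = x_obs[np.argmin(y_obs)]
print(best_log_lr, y_obs.min())
```

In the setting described in the text, the search space would cover the four listed hyperparameters jointly, and each call to the objective would correspond to one full training run evaluated on the validation set.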
Most available datasets contain only two types of motor imagery EEG, commonly hand movements, or include only one foot motor imagery. Therefore, the BCI Competition IV dataset 2a was used to evaluate the performance of the proposed methods on more than two motor imageries. This dataset was chosen because the primary objective of this paper is to decode multi-class motor imagery EEG and because it is one of the most widely used datasets in the scientific literature. To complement the evaluation, a new dataset was recorded from eight subjects using the OpenBCI device with a set of 8 electrodes, with the aim of facilitating the use of MI-BCI, which commonly requires more than 20 electrodes for multi-class classification [20], [21]. Furthermore, the idle state is commonly overlooked by existing BCIs; in this work it was therefore added as a class in the new dataset, since recognizing it is essential if the proposed methods are to be implemented in a realistic application.
In summary, the original contributions of the paper are: the evaluation of two CNN methods (monolithic and modular architectures) for multiple motor imagery classification, the optimization of CNNs by Bayesian optimization, the development of a frequency band selection algorithm, and the generation of a multi-motor imagery dataset that includes an idle state. The paper is organized as follows. Section 2 describes the datasets used. Section 3 explains the general scheme of the designed BCI. Section 4 presents the pre-processing and the implementation of CSP. Section 5 explains the structure of the input matrices for the CNNs and the Bayesian optimization, and presents the resulting optimized hyperparameters. Section 6 shows the performance of the proposed methods. Finally, Section 7 discusses the conclusions.
Section snippets
Datasets
The datasets used for this work were the BCI Competition IV dataset 2a and another acquired using the OpenBCI device. Dataset 2a [21] consists of four different motor imagery tasks: left hand, right hand, both feet, and tongue movements. Nine subjects executed the trials for one training session and one test session; there are 288 trials per session in total, 72 trials per class. These trials were executed with authorization of the ethics committee of our institution under authorization
General scheme
The design of the proposed systems is divided into four stages. The first consists of selecting the most discriminant frequency bands for the classification of each binary class combination (one frequency band for each combination). The next stage presents the design of the spatial filters. In the third stage, Bayesian optimization is applied to find the optimal hyperparameters for the structure and training of the convolutional neural networks. Finally, the last stage illustrates the
Frequency bands selection
The amplitude modulation in the sensorimotor rhythms can be detected between the Mu and Beta bands. The frequency ranges of Mu and Beta vary in the literature. For example, Lotte et al. [1] consider Mu and Beta to be 8–12 Hz and 16–24 Hz, respectively; however, Pfurtscheller et al. [12] established the bands as Mu 10–12 Hz and Beta 14–18 Hz, and Chai et al. [27] as Mu 8–13 Hz and Beta 14–30 Hz. Even if a frequency range among the previously mentioned is selected, there exist sub-bands that give
Implementation of convolutional neural networks
To solve the multi-class classification problem, a monolithic network (a single CNN) and a modular network were designed. The modular network was composed of four expert CNNs, each of which classifies one of the following binary combinations. For dataset 2a: hands vs. feet, hands vs. tongue, left hand vs. right hand, and feet vs. tongue. For the OpenBCI dataset: motor imagery vs. idle, hands vs. feet, left hand vs. right hand, and left foot vs. right foot. In this section, the
Results and Discussion
In this section, the classification performance of the monolithic and modular networks is presented. The performances of the four expert CNNs that compose the modular network are also presented in order to analyze how well they classify their corresponding classes.
For the BCI Competition IV dataset 2a, both methods were trained with the training session and tested with the testing session separately for each subject. As for the OpenBCI dataset, the evaluation was carried out by
Conclusions
In this work, a monolithic network and a modular network were designed for the multiclass classification of motor imagery EEG signals. The classes involved are several motor imageries and a mental state in which the subject does not execute any motor imagery. Both approaches used optimized convolutional neural networks to classify and extract information from matrices composed of features obtained by using a variant of DFBCSP, processing the data in both the spectral and spatial domains. The frequency
Acknowledgment
The authors wish to extend their thanks to Tecnológico Nacional de México for the support provided to carry out this work under the project 5684.16-P.
References (44)
- et al. A new hybrid BCI paradigm based on P300 and SSVEP. J. Neurosci. Methods (2015)
- et al. Interval type-2 fuzzy logic based multiclass ANFIS algorithm for real-time EEG based movement control of a robot arm. Robot. Auton. Syst. (2015)
- et al. EEG-based discrimination between imagination of right and left hand movement. Electroencephalogr. Clin. Neurophysiol. (1997)
- et al. Single-trial EEG classification of motor imagery using deep convolutional neural networks. Optik (2017)
- et al. Simultaneous and independent control of a brain-computer interface and contralateral limb movement. Brain-Comput. Interfaces (2015)
- et al. A novel hybrid BCI speller based on the incorporation of SSVEP into the P300 paradigm. J. Neural Eng. (2013)
- Decimation filter with common spatial pattern and fishers discriminant analysis for motor imagery classification
- et al. Filter bank common spatial pattern (FBCSP) in brain-computer interface
- et al. A New discriminative common spatial pattern method for motor imagery brain–computer interfaces. Trans. Biomed. Eng. (2009)
- Classification of 2-dimensional cursor movement imagery EEG signals
- On the use of convolutional neural networks and augmented CSP
- Decoding human motor activity from EEG single trials for a discrete two-dimensional cursor control. J. Neural Eng.
- Mu rhythm (de)synchronization and EEG single-trial classification of different motor imagery tasks. NeuroImage
- Gradient-based learning applied. Proc. IEEE
- Application of convolutional neural networks to four-class motor imagery classification problem. Inf. Technol. Control
- Practical Bayesian Optimization of Machine Learning Algorithms
- The application of Bayesian methods for seeking the extremum. Towards Glob. Optim.
- EEG datasets for motor imagery brain computer interface. Gigascience
- Review of the BCI competition IV. Front. Neurosci.