Elsevier

Applied Soft Computing

Volume 75, February 2019, Pages 461-472

Classification of multiple motor imagery using deep convolutional neural networks and spatial filters

https://doi.org/10.1016/j.asoc.2018.11.031

Highlights

  • Solution for a multi-class motor imagery classification problem.

  • Optimization of the Convolutional Neural Networks by Bayesian optimization.

  • Development of a frequency bands selection algorithm.

  • Generation of a multi-motor imagery dataset including an idle state.

Abstract

Brain–Computer Interfaces (BCI) are systems that translate brain activity patterns into commands for an interactive application, and some of them recognize patterns generated by motor imagery. Currently, the performance and methodologies of these systems are still not practical enough for realistic applications. Therefore, this paper proposes two methodologies for multiple motor imagery classification. Both methodologies use features extracted by a variant of Discriminative Filter Bank Common Spatial Pattern (DFBCSP) presented in this paper. The frequency band selection in this variant is carried out by a novel iterative algorithm that, for each binary combination of classes, selects the frequency band that attains the highest classification accuracy for that binary classification. The resulting samples are then arranged into a matrix that feeds one or several Convolutional Neural Networks previously optimized by Bayesian optimization. The first methodology applies a single Convolutional Neural Network (CNN) to classify all classes, and the second is a modular network composed of four expert CNNs. In this modular network, each expert CNN performs a binary classification, and a fully connected network analyzes their results. To validate both approaches, two datasets were used: the BCI competition IV dataset 2a and another, presented in this paper, recorded from eight subjects using the OpenBCI device. The experimental results demonstrated an improvement in classification accuracy over many classic intelligent recognition methods, without a high computation time, so that the methods can be implemented in an online application.

Introduction

Research on new human–computer interaction systems continues, and brain–computer interfaces (BCI) are among them. These systems allow users to communicate with or manipulate external devices by using only electrophysiological measures of their brain activity [1]. This electrical brain activity is caused by the flow of electric currents during synaptic excitation of the dendrites in the neurons. Non-invasive recording procedures are commonly used since they are safer for the subjects, more comfortable to implement, and can be used by most of the population, including people with physical disabilities [2], [3], [4]. The signals acquired by this procedure are known as electroencephalographic (EEG) signals. From the EEG signals, several brain activity patterns can be detected that allow the user's intentions to be determined; among them are the patterns generated by motor imagery. Motor imagery (MI) occurs when the user imagines the execution of a certain physical activity, such as opening and closing a hand or moving a foot. When motor imagery occurs, there is an energy suppression of the Mu (8–12 Hz) and Beta (16–24 Hz) rhythms, a phenomenon known as event-related desynchronization (ERD). At the end of the motor imagery execution, an energy enhancement of the rhythms occurs; this event is called event-related synchronization (ERS).
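As an illustration of the ERD/ERS description above, the relative band-power change of a task window with respect to a reference window can be computed as sketched below. This is a minimal sketch and not the procedure used in the paper: the function names are hypothetical, the segments are assumed to be already band-pass filtered to the rhythm of interest, and the sample values are toy numbers.

```python
def band_power(samples):
    """Mean squared amplitude of an already band-pass-filtered segment."""
    if not samples:
        raise ValueError("segment is empty")
    return sum(x * x for x in samples) / len(samples)


def erd_percent(reference, task):
    """Relative power change of the task window versus the reference window.

    Returns a percentage: negative values indicate a power decrease (ERD),
    positive values a power increase (ERS).
    """
    ref_power = band_power(reference)
    return (band_power(task) - ref_power) / ref_power * 100.0


# Toy data (not real EEG): the amplitude halves during the imagery window.
reference = [1.0, -1.0, 1.0, -1.0]
task = [0.5, -0.5, 0.5, -0.5]
print(erd_percent(reference, task))  # -75.0
```

Halving the amplitude reduces the power to one quarter, hence the 75% decrease reported for the toy data.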

In the case of multiple motor imagery classification, several methods have been proposed. Among the most successful ones is the generation of spatial filters by using the Common Spatial Pattern (CSP) algorithm. These filters are used to extract discriminative spatial patterns that contrast the energy features of different motor imageries [5]. By applying the spatial filters, the EEG signals are projected onto a subspace where they present higher discrimination between the two motor imageries involved [5]. Variants of this algorithm have emerged with the aim of increasing the classification performance, such as Filter Bank Common Spatial Pattern (FBCSP) [6] and Discriminative Filter Bank Common Spatial Pattern (DFBCSP) [7]. These variants have shown that applying CSP at different frequency bands improves the classification accuracy, since these methods are capable of learning subject-specific patterns from the high-dimensional EEG measurements. The most effective variant has been DFBCSP, since it can identify the most discriminative frequency bands according to the motor imagery and the subject. This makes it a more robust approach for the classification of non-stationary motor imagery EEG signals, unlike FBCSP, where all the frequency bands are fixed for all subjects. These feature extraction methods are typically used in conventional classification methodologies along with algorithms such as Linear Discriminant Analysis (LDA) [8], artificial neural networks (ANN) [9] or fuzzy systems [10]. However, these conventional methods, like others [11], [12], yield systems with average accuracies below 80%, and most of them solve only binary classification problems. Therefore, the development of methodologies with better performance is still ongoing.
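For reference, the two-class CSP projection described above is commonly written as a generalized eigenvalue problem. The formulation below is the standard textbook one; the notation (trial matrices X_i, index sets I_c, number of filter pairs m) is an assumption made for illustration and is not taken from the paper.

```latex
% Standard two-class CSP; notation is illustrative, not the paper's.
% X_i \in \mathbb{R}^{C \times T}: band-pass filtered trial, C channels, T samples.
% I_c: set of training trials that belong to class c.
\Sigma_c = \frac{1}{|I_c|} \sum_{i \in I_c}
           \frac{X_i X_i^{\top}}{\operatorname{tr}\!\left(X_i X_i^{\top}\right)},
\qquad c \in \{1, 2\}

% Spatial filters w solve the generalized eigenvalue problem
\Sigma_1 w = \lambda \left(\Sigma_1 + \Sigma_2\right) w,
\qquad 0 \le \lambda \le 1

% Eigenvectors with \lambda close to 1 maximize the variance of class 1,
% those with \lambda close to 0 maximize the variance of class 2.
% Keeping the m filters from each end gives W = [w_1, \dots, w_{2m}] and
% the usual log-variance features of a trial X:
Z = W^{\top} X,
\qquad
f_j = \log \frac{\operatorname{var}(Z_j)}{\sum_{k=1}^{2m} \operatorname{var}(Z_k)},
\qquad j = 1, \dots, 2m
```

In filter-bank variants such as FBCSP and DFBCSP, this computation is repeated per frequency band, with X_i replaced by the trial filtered to that band.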

The existing feature extraction methods for MI-BCI mainly focus on the extraction of static energy features, neglecting the dynamics of the signal during the execution of motor imagery: when static energy features are extracted, the energy dynamics are reduced to a single number and, hence, temporal information is ignored. Pfurtscheller et al. [13] showed that foot and tongue motor imageries have a weaker drop in energy than left- or right-hand motor imagery; they are more likely to produce an energy pattern in certain channels and frequencies. Therefore, it is necessary to develop a methodology that can extract temporal information and combine it with static energy features to build a classifier that can handle a wider variety of motor imageries. One possible and relatively new approach might be found in the deep learning branch of machine learning. These algorithms have the potential to detect features, across a wider range of data, that had previously been overlooked. An algorithm that has reached promising results in other areas is the Convolutional Neural Network (CNN), originally proposed by Y. LeCun et al. [14]. Consequently, this paper evaluates the performance of CNNs in detecting dynamic energy-based features. Convolution can detect patterns based on a learned kernel and can be used as a method to detect events in signal processing. CNNs have previously been used for the classification of EEG signals and have achieved better performance than shallow classification algorithms. Among the works aimed at classifying different motor imageries is the one by Uktveris et al. [15], where a four-class motor imagery classification was carried out by employing diverse feature extraction methods, achieving an average accuracy of 68%. Another work on four-class motor imagery classification was presented by Yang et al. [9], whose system extracted features using an augmented CSP algorithm and attained an average accuracy of 68.45%. For binary classification problems, Tang et al. [16] achieved an average accuracy of 86.41% by using a five-layer CNN for the classification between hand motor imageries.

Despite all these works, more research is still needed on developing multi-class BCIs with better performance and with online capabilities for real-world applications. Therefore, this paper presents two new methods, both applying CNNs, for the classification of multiple motor imageries and an idle state (a state without any motor imagery). To classify these mental states, both methods employ features extracted using CSP in different frequency bands, following an improved DFBCSP method presented in this paper, where the bandpass selection procedure determines the best frequency bands considering the mental states to be discriminated. A monolithic method using a single CNN and a modular method using four expert CNNs and a fully connected network are implemented, both for the classification of multiple classes (MI of the left hand, right hand, left foot, right foot, tongue, and an idle state), with the aim of analyzing the performance of both methods.
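The idea of choosing one frequency band per binary combination of mental states can be sketched as follows. This is a simplified exhaustive search, not the iterative algorithm proposed in the paper: `select_bands`, `score_band` and `toy_score` are hypothetical names, the candidate bands are arbitrary, and the toy scorer stands in for the validation accuracy of a CSP-based binary classifier.

```python
from itertools import combinations


def select_bands(classes, candidate_bands, score_band):
    """Pick, for every pair of classes, the candidate band with the best score.

    score_band(pair, band) is a hypothetical callback; in practice it would
    return the accuracy of a binary classifier trained on CSP features
    extracted from that band.
    """
    best = {}
    for pair in combinations(classes, 2):
        best[pair] = max(candidate_bands, key=lambda band: score_band(pair, band))
    return best


def toy_score(pair, band):
    # Made-up rule, NOT real results: prefer the band whose center is closest
    # to an arbitrary "preferred" frequency assigned to each pair.
    preferred = {
        ("left_hand", "right_hand"): 10.0,
        ("left_hand", "tongue"): 20.0,
        ("right_hand", "tongue"): 26.0,
    }
    low, high = band
    return -abs((low + high) / 2.0 - preferred[pair])


classes = ["left_hand", "right_hand", "tongue"]
bands = [(8, 12), (12, 16), (16, 24), (24, 30)]  # arbitrary candidates in Hz
selected = select_bands(classes, bands, toy_score)
for pair, band in selected.items():
    print(pair, "->", band)
```

With n classes this evaluates n(n-1)/2 pairs, so the cost of the scorer dominates; an iterative search such as the one the paper describes would aim to reduce the number of bands that need to be evaluated.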

Because the information available to design an optimal CNN structure is scarce, some structure and training hyperparameters were selected by applying Bayesian optimization. Snoek et al. [17] demonstrated that Bayesian optimization can surpass a human expert at selecting hyperparameters, outperforming previous hyperparameter selection procedures by over 3%. The optimized hyperparameters are the number of convolutional layers, the initial learning rate, the stochastic gradient descent momentum, and the L2 regularization strength. Bayesian optimization [18] typically assumes that the unknown objective function was sampled from a Gaussian process (GP) and updates the posterior distribution over this function after each observation [17]. A GP is a probability distribution over random functions, which can have an infinite set of variables, such that any finite subset of random variables has a multivariate Gaussian distribution [19]. This work uses the observations obtained with different hyperparameter configurations as a general performance measure. The hyperparameters for the next experiment are selected by minimizing the classification error on the validation set.
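For readers unfamiliar with the procedure, the GP posterior and one common acquisition rule can be written as below. This is a sketch under stated assumptions: a zero-mean GP prior with kernel k and observation noise variance \tau^2 is assumed, and expected improvement (EI) is used as the acquisition function because it is a widely used choice; this section does not state which acquisition function the authors used.

```latex
% Observations: hyperparameter settings X = (x_1, ..., x_n) with validation
% errors y = (y_1, ..., y_n). K is the n x n kernel matrix, K_{ij} = k(x_i, x_j).
% Assumed: zero-mean GP prior, Gaussian noise with variance \tau^2.

% GP posterior mean and variance at a candidate setting x
\mu_n(x) = k(x, X) \left[ K + \tau^2 I \right]^{-1} y
\sigma_n^2(x) = k(x, x) - k(x, X) \left[ K + \tau^2 I \right]^{-1} k(X, x)

% Expected improvement for minimization, with y_{\min} = \min_i y_i,
% \Phi the standard normal CDF and \varphi the standard normal PDF
\gamma(x) = \frac{y_{\min} - \mu_n(x)}{\sigma_n(x)}
\mathrm{EI}(x) = \sigma_n(x) \left[ \gamma(x)\, \Phi(\gamma(x)) + \varphi(\gamma(x)) \right]

% Next hyperparameter setting to evaluate
x_{n+1} = \arg\max_{x} \mathrm{EI}(x)
```

Each new evaluation (training the CNN with x_{n+1} and measuring its validation error) is appended to the observations, and the posterior is recomputed before the next candidate is chosen.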

Most of the available datasets contain only two types of motor imagery EEG, commonly hand movements or a single foot motor imagery. Therefore, to evaluate the performance of the proposed methods with more than two motor imagery classes, the BCI competition IV dataset 2a was used. This dataset was chosen because the primary objective of this paper is to decode multi-class motor imagery EEG and because it is one of the most used datasets in the scientific literature. To complement the evaluation, a new dataset was recorded from eight subjects using the OpenBCI device with a set of 8 electrodes, with the aim of facilitating the use of MI-BCI, which commonly requires more than 20 electrodes for multi-class classification [20], [21]. Furthermore, the idle state is commonly overlooked by existing BCIs; therefore, in this work it was added as a class in the new dataset, since recognizing it is essential if the proposed methods are to be implemented in a realistic application.

In summary, the original contributions of the paper are: evaluation of two CNN methods (monolithic and modular architecture) for multiple motor imagery classification, optimization of CNNs by Bayesian optimization, development of a frequency band selection algorithm, and generation of a multi-motor imagery dataset that includes an idle state. The paper is organized as follows. Section 2 describes the datasets used. The general scheme of the designed BCI is explained in Section 3. Section 4 presents the pre-processing and the implementation of CSP. Section 5 explains the structure of the input matrices for the CNNs and the Bayesian optimization, and presents the resulting optimized hyperparameters. The performance of the proposed methods is shown in Section 6. Finally, the conclusions are discussed in Section 7.

Section snippets

Datasets

The datasets used for this work were the BCI competition IV dataset 2a and another acquired by using the OpenBCI device. The dataset 2a [21] consists of four different motor imagery tasks: left hand, right hand, both feet, and tongue movements. Nine subjects executed the trials for one training session and one test session; there are in total 288 trials per session, 72 trials per class. These trials were executed with authorization of the ethics committee of our institution under authorization

General scheme

The design of the proposed systems is divided into four stages. The first consists of the selection of the most discriminant frequency bands for the classification of each binary class combination (one frequency band for each combination). The next stage presents the design of the spatial filters. In the third stage, a Bayesian optimization is applied to find the optimal hyperparameters for the structure and training of the convolutional neural networks. Finally, the last stage illustrates the

Frequency bands selection

The amplitude modulation in the sensorimotor rhythms can be detected between the Mu and Beta bands. The frequency ranges of Mu and Beta vary in the literature. For example, Lotte et al. [1] consider Mu and Beta as 8–12 Hz and 16–24 Hz, respectively; however, Pfurtscheller et al. [12] established Mu as 10–12 Hz and Beta as 14–18 Hz, and Chai et al. [27] Mu as 8–13 Hz and Beta as 14–30 Hz. Even if a frequency range among the previously mentioned is selected, there exist sub-bands that give

Implementation of convolutional neural networks

To solve the multi-class classification problem, a monolithic network (a single CNN) and a modular network were designed. The modular network was composed of four expert CNNs, each of which classifies one of the following binary combinations. For the dataset 2a: hands vs. feet, hands vs. tongue, left hand vs. right hand, and feet vs. tongue. For the OpenBCI dataset: motor imagery vs. idle, hands vs. feet, left hand vs. right hand, and left foot vs. right foot. In this section, the

Results and Discussion

In this section, the classification performance of the monolithic and modular networks is presented. The performances of the four expert CNNs that compose the modular network are also presented in order to analyze how well they classify their corresponding classes.

For the BCI Competition IV dataset 2a, both methods were trained with the training session and tested with the testing session separately for each subject. As for the OpenBCI dataset, the evaluation was carried out by

Conclusions

In this work, a monolithic network and a modular network were designed for the multiclass classification of motor imagery EEG signals. The classes involved are several motor imageries and a mental state where the subject does not execute any motor imagery. Both approaches used optimized convolutional neural networks to classify and extract information from matrices composed of features obtained by using a variant of DFBCSP, processing the data in both the spectral and spatial domains. The frequency

Acknowledgment

The authors wish to extend their thanks to Tecnológico Nacional de México for the support provided to carry out this work under the project 5684.16-P.

References (44)

  • Aydemir Ö., Classification of 2-dimensional cursor movement imagery EEG signals

  • Yang H. et al., On the use of convolutional neural networks and augmented CSP

  • Huang D. et al., Decoding human motor activity from EEG single trials for a discrete two-dimensional cursor control, J. Neural Eng. (2009)

  • Pfurtscheller G. et al., Mu rhythm (de)synchronization and EEG single-trial classification of different motor imagery tasks, NeuroImage (2006)

  • LeCun Y., Gradient-based learning applied, Proc. IEEE (1998)

  • Uktveris T., Application of convolutional neural networks to four-class motor imagery classification problem, Inf. Technol. Control (2017)

  • Snoek J. et al., Practical Bayesian Optimization of Machine Learning Algorithms (2012)

  • Mockus J. et al., The application of Bayesian methods for seeking the extremum, Towards Glob. Optim. (1978)

  • M.A. Gelbart, J. Snoek, R.P. Adams, Bayesian Optimization with Unknown Constraints, 22 May 2014. [Online]. Available:...

  • Cho H. et al., EEG datasets for motor imagery brain computer interface, Gigascience (2017)

  • Tangermann M., Review of the BCI competition IV, Front. Neurosci. (2012)

  • C. Brunner, R. Leeb, G.R. Muller-Putz, A. Schlogl, G. Pfurtscheller, BCI Competition 2008–Graz data set A, 2008....