Elsevier

Expert Systems with Applications

Volume 106, 15 September 2018, Pages 169-182
EEG signal classification using universum support vector machine

https://doi.org/10.1016/j.eswa.2018.03.053

Highlights

  • Electroencephalogram signal classification is performed using universum learning.

  • Support vector machine classifier uses prior information from interictal signals.

  • Many feature extraction techniques are used for comparing the algorithms.

  • Universum support vector machine is used for the first time for seizure classification.

Abstract

Support vector machine (SVM) has been widely used to classify electroencephalogram (EEG) signals for the diagnosis of neurological disorders such as epilepsy and sleep disorders. SVM shows good generalization performance for high dimensional data due to its convex optimization problem. Incorporating prior knowledge about the data leads to a better optimized classifier. Different types of EEG signals provide information about the distribution of EEG data. To include this prior information in the classification of EEG signals, we propose a novel machine learning approach based on universum support vector machine (USVM). In our approach, the universum data points are selected from the EEG dataset itself: they are the interictal EEG signals. This removes the effect of outliers on the generation of universum data. Further, to reduce the computation time, we use our approach of universum selection with universum twin support vector machine (UTSVM), which has a lower computational cost than traditional SVM. To check the validity of the proposed methods, we apply various feature extraction techniques to build different datasets consisting of healthy and seizure signals. Several numerical experiments are performed on the generated datasets, and the results of the proposed approach are compared with other baseline methods. The proposed USVM and UTSVM show better generalization performance than the baseline SVM, USVM, twin SVM (TWSVM) and UTSVM. The proposed UTSVM achieved the highest classification accuracy, 99%, for the healthy and seizure EEG signals.

Introduction

Electroencephalogram (EEG) signal classification is a major challenge in the field of machine learning and signal processing. EEG is a widely used non-invasive technique for the detection of various types of brain disorders such as epileptic seizures and sleep disorders. In epilepsy, the extent of the disease ranges from partial to generalized seizures, which are reflected in the respective EEG. The different types of EEG signals are shown in Fig. 2. Researchers have used several signal processing techniques to improve feature extraction and classification of EEG signals. Among the various feature extraction techniques, the wavelet transform is one of the most frequently used. The wavelet transform extracts frequency domain features with good localization in time, in contrast to the Fourier transform, where the signal analysis is done mainly in the frequency domain. In wavelet analysis, the approximation and detail coefficients are used to form the feature vector, as shown in Fig. 3. Different wavelet families are used for specific types of signals to better capture the characteristics of those signals. Adeli, Zhou, and Dadmehr (2003) proposed a computer aided diagnosis (CAD) method for epilepsy using discrete wavelet transform (DWT). They used the Daubechies wavelet db-4 as the mother wavelet for feature extraction. Rosso et al. (2005) used orthogonal decimated discrete wavelet transform (ODWT) for detecting maturational changes associated with childhood absence epilepsy. Ocak (2008) performed the classification of EEG signals using wavelet packet analysis and a genetic algorithm. Daubechies wavelet-2 is used for the classification of five different EEG signals (Guler & Ubeyli, 2005). Subasi and Gursoy (2010) used principal component analysis (PCA), linear discriminant analysis (LDA) and independent component analysis (ICA) for feature extraction, and support vector machine (SVM) for classification.
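The idea of building a feature vector from wavelet coefficients can be illustrated with a minimal sketch. It uses the Haar wavelet rather than the Daubechies wavelets cited above, purely because Haar fits in a few lines of standard-library Python; the function names and the choice of summary statistics (mean absolute value and standard deviation per sub-band) are assumptions made for illustration, not the feature set of any cited work.

```python
import math
from statistics import mean, pstdev


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_features(signal, levels=3):
    """Decompose `levels` times; summarise each sub-band by mean |c| and std."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features += [mean(abs(c) for c in detail), pstdev(detail)]
    # The final approximation band is summarised the same way.
    features += [mean(abs(c) for c in approx), pstdev(approx)]
    return features
```

With `levels=3` the sketch returns eight numbers: two per detail band plus two for the final approximation band. A library implementation would normally be preferred when a specific wavelet family such as db-4 is required.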

The proper selection of classification techniques is crucial for the automated diagnosis of patients with neurological diseases. Among the various classification algorithms, support vector machines (SVMs) (Cortes & Vapnik, 1995) have emerged as a powerful classification technique. SVM solves a convex optimization problem, which leads to a globally optimal solution. This is in contrast to artificial neural networks (ANNs), which suffer from the problem of local minima. SVM also has a lower VC (Vapnik-Chervonenkis) dimension, which enables it to classify high dimensional data with fewer parameters to optimize. Many researchers have used SVM in the classification of EEG signals (Ma et al., 2016) and for the diagnosis of neurological diseases like epilepsy (Liu et al., 2012, Nicolaou and Georgiou, 2012, Zavar et al., 2011). Guo et al. (2011) performed the classification of mental tasks from the analysis of EEG signals using SVM. Least squares support vector machine (LSSVM) (Suykens & Vandewalle, 1999) is used in (Bajaj and Pachori, 2012, Joshi et al., 2014, Li and Wen, 2009, Sharma and Pachori, 2015) for the detection of epilepsy. LSSVM is also used for the classification of EEG signals with a clustering based approach (Li & Wen, 2011). For multiclass classification of EEG signals, Guler and Ubeyli (2007) proposed a support vector machine based model and showed that SVM gives better classification accuracy for EEG signals than probabilistic neural network (PNN) and multilayer perceptron neural network (MLPNN).

Weston, Collobert, Sinz, Bottou, and Vapnik (2006) proposed a universum support vector machine (USVM) to give the classifier prior information about the distribution of the data. The universum data points do not belong to any of the classes and lie within an ε-insensitive tube between the two classes. This approach is also called 'learning through contradiction'. Along with the hinge loss on the labelled points, USVM applies an ε-insensitive loss function to the universum points. This universum based approach has been applied to various real world applications. Long, Tang, and Tian (2016) performed the classification of investor sentiments using universum support vector machine. Gao, Tian, Shao, and Deng (2008) used universum SVM for the prediction of translation initiation in proteins. They used two approaches for selecting the universum: one based on a uniform distribution of noise and the other on random averaging of the data points. Hao and Zhang (2013) proposed an ensemble universum support vector machine for the detection of Alzheimer's disease from brain imaging data, using patients with mild cognitive impairment (MCI) as the universum. Text classification is also performed using universum data (Liu, Hsaio, Lee, Chang, & Kuo, 2016).
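The combination of the two losses can be made concrete with a small sketch of the USVM primal objective for a linear decision function f(x) = w·x + b. The function and parameter names (`usvm_objective`, `C`, `C_u`, `eps`) are illustrative assumptions; the sketch only evaluates the objective for given w and b and does not solve the optimization problem.

```python
def hinge_loss(margin):
    """Hinge loss for a labelled point, where margin = y * f(x)."""
    return max(0.0, 1.0 - margin)


def eps_insensitive_loss(score, eps):
    """Loss for a universum point: zero while |f(u)| <= eps, linear beyond that."""
    return max(0.0, abs(score) - eps)


def usvm_objective(w, b, X, y, U, C, C_u, eps):
    """0.5*||w||^2 + C * sum(hinge) + C_u * sum(eps-insensitive) for f(x) = w.x + b."""

    def f(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    regulariser = 0.5 * sum(wi * wi for wi in w)
    labelled = sum(hinge_loss(yi * f(xi)) for xi, yi in zip(X, y))
    universum = sum(eps_insensitive_loss(f(u), eps) for u in U)
    return regulariser + C * labelled + C_u * universum
```

A universum point is penalised only when its score leaves the band [-eps, eps] around the decision boundary, which is how the classifier is pushed to keep universum points close to f(x) = 0.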

The major challenge with the universum based approach is the proper selection of universum data points. In Weston et al. (2006), the universum data is selected based on the similarity of digits in digit classification. For example, digit '3' is chosen as universum for classifying '5' and '8' since its shape is similar to both '5' and '8'. Chapelle, Agarwal, Sinz, and Schölkopf (2008) presented an analysis for the selection of proper universum data. In (Bai & Cherkassky, 2008), universum samples are generated for the classification of faces using the random averaging approach, where the average of the pixels of two faces is used as the universum. In (Chen & Zhang, 2009), an in-between-universum (IBU) approach is proposed for the proper selection of universum. The practical conditions for choosing the universum data are given in (Cherkassky and Dai, 2009, Cherkassky et al., 2011). In the recent decade, some nonparallel SVMs such as generalized eigenvalue proximal support vector machine (GEPSVM) (Mangasarian & Wild, 2006) and twin support vector machine (TWSVM) (Jayadeva, Khemchandani, & Chandra, 2007) have been proposed to reduce the computational complexity of standard SVM. Inspired by TWSVM, some scholars proposed variants of TWSVM (Khemchandani et al., 2016, Kumar and Gopal, 2009, Qi et al., 2013, Shao et al., 2011, Tanveer, 2015a, Tanveer, 2015b, Tanveer et al., 2016, Wang et al., 2015, Xu et al., 2017) to improve its performance and reduce its computational complexity. TWSVM is used for the first time in this work for the classification of seizure EEG signals. Qi, Tian, and Shi (2012) proposed a universum twin support vector machine (UTSVM) to reduce the computational complexity of USVM and used the random averaging approach for universum selection. Xu, Chen, and Li (2016) also used the random averaging scheme for selecting the universum data. Since the random averaging approach suffers from the effect of outliers, the method of generating universum data depends on the type of application and is currently an area of research.
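The random averaging scheme mentioned above can be sketched as follows: each universum point is the mean of one randomly chosen sample from each class. The function name, the sampling with replacement, and the fixed seed are assumptions made for illustration, not details taken from the cited works.

```python
import random


def random_averaging_universum(pos, neg, n_universum, seed=0):
    """Generate universum points by averaging random pairs from opposite classes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    universum = []
    for _ in range(n_universum):
        p = rng.choice(pos)  # one sample from the positive class
        q = rng.choice(neg)  # one sample from the negative class
        universum.append([(a + b) / 2.0 for a, b in zip(p, q)])
    return universum
```

Because every generated point is a midpoint of two training samples, a single outlier in either class shifts every universum point it is paired with, which is the sensitivity to outliers noted in the text.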

Motivated by the work on universum support vector machine in (Gao et al., 2008, Hao and Zhang, 2013, Long et al., 2016), we propose a novel approach to selecting the universum in the classification of EEG signals for seizure detection. Since universum based support vector machines have not previously been used for the classification of EEG signals, we also present an application of USVM and UTSVM to EEG signals. For the classification of EEG signals into the healthy and seizure (ictal) classes, the interictal EEG signals, which are EEG recordings from the periods between seizures in a patient with epilepsy, are chosen as the universum. Our approach to EEG classification is tested on different datasets generated using various feature extraction techniques, and the results are compared with other existing methods.
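As a minimal sketch of this selection, the feature vectors can be partitioned into labelled data and universum according to the set each recording comes from. The set names follow the Andrzejak et al. (2001) dataset used in the experiments; treating sets N and F as interictal and set S as seizure is an assumption based on the usual description of that dataset, and the label signs, names and function are hypothetical.

```python
# Hypothetical role assignment for the five sets of the Andrzejak et al. (2001) data.
# Z and O as healthy follows the text; N/F as interictal and S as seizure are assumptions.
ROLE_BY_SET = {
    "Z": "healthy",
    "O": "healthy",
    "N": "interictal",
    "F": "interictal",
    "S": "seizure",
}
LABEL_BY_ROLE = {"healthy": 1, "seizure": -1}  # label signs are arbitrary


def split_for_universum(records):
    """Split (set_name, feature_vector) pairs into labelled data X, y and universum U."""
    X, y, U = [], [], []
    for set_name, features in records:
        role = ROLE_BY_SET[set_name]
        if role == "interictal":
            U.append(features)  # universum: belongs to neither class
        else:
            X.append(features)
            y.append(LABEL_BY_ROLE[role])
    return X, y, U
```

Unlike random averaging, no new points are synthesised here: the universum consists of recorded signals, so it cannot be distorted by averaging with an outlier.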

In this work, all vectors are taken as column vectors. The inner product of two vectors is written $a^{t}b$, where $a$ and $b$ are vectors in the $n$-dimensional real space $\mathbb{R}^{n}$ and $a^{t}$ is the transpose of $a$. $\|a\|$ and $\|G\|$ represent the 2-norm of a vector $a$ and a matrix $G$, respectively. $e$ denotes the vector of ones of dimension $m$. $I$ represents the identity matrix of appropriate size.

The rest of this paper is organized as follows. Section 2 discusses the formulations of USVM and UTSVM. Section 3 elaborates our proposed approach to USVM and UTSVM. Section 4 reports numerical experiments with the existing and proposed approaches on datasets generated from EEG signals using different feature extraction techniques. Finally, Section 5 gives the conclusions and possible future directions.

Section snippets

Related work

In this section, we briefly review USVM and UTSVM. For a detailed description, interested readers are referred to (Qi et al., 2012, Weston et al., 2006).

Proposed approach

Many classification approaches for EEG signals do not use prior information about the distribution of EEG data. As a result, they may fail to achieve better generalization performance even when the most efficient feature extraction technique is used. The universum based approach supplies some prior information during the construction of the classifier. We therefore use a universum based approach with support vector machine to classify the EEG signals.

Numerical experiments

In this section, numerical experiments are performed for the classification of EEG signals of healthy state and seizure. The EEG dataset is taken from (Andrzejak et al., 2001) which is available online. The dataset consists of five sets viz. Z, O, N, F and S. Each set contains 100 single-channel EEG signals sampled at a sampling rate of 173.61 Hz and of 23.6 s duration. The sets Z and O are surface EEG recordings of five healthy volunteers with eyes open and closed respectively. The sets N and

Conclusions

On the basis of the experimental results, it can be stated that our universum based approach gives better generalization performance for the classification of EEG signals as compared to the existing approaches. Our method of selection of universum points has proved to be a promising approach for the classification of healthy and seizure EEG signals. Also, the effect of outliers on the universum is reduced by using the universum from the EEG dataset itself i.e., the seizure free EEG signal. The

Acknowledgements

This work was supported by Science and Engineering Research Board (SERB) as Early Career Research Award grant no. ECR/2017/000053 and Department of Science and Technology as Ramanujan fellowship grant no. SB/S2/RJN-001/2016. We gratefully acknowledge the Indian Institute of Technology Indore for providing facilities and support. We are thankful to the Ministry of Human Resource Development (MHRD), Govt. of India for providing Teaching Assistantship (TA) fellowship to Mr. Bharat Richhariya.

References (47)

  • A. Subasi et al. EEG signal classification using PCA, ICA, LDA and support vector machines. Expert Systems with Applications (2010)
  • M. Zavar et al. Evolutionary model selection in a wavelet-based support vector machine for automated seizure detection. Expert Systems with Applications (2011)
  • R.G. Andrzejak et al. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E (2001)
  • X. Bai et al. Gender classification of human faces using inference through contradictions.
  • V. Bajaj et al. Classification of seizure and nonseizure EEG signals using empirical mode decomposition. IEEE Transactions on Information Technology in Biomedicine (2012)
  • M.S. Bartlett et al. Face recognition by independent component analysis. IEEE Transactions on Neural Networks (2002)
  • O. Chapelle et al. An analysis of inference with the universum.
  • S. Chen et al. Selecting informative Universum sample for semi-supervised learning.
  • V. Cherkassky et al. Empirical study of the Universum SVM learning for high-dimensional data.
  • V. Cherkassky et al. Practical conditions for effectiveness of the universum learning. IEEE Transactions on Neural Networks (2011)
  • C. Cortes et al. Support-vector networks. Machine Learning (1995)
  • J. Demšar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research (2006)
  • T. Gao et al. Accurate prediction of translation initiation sites by Universum SVM.
