Elsevier

Ad Hoc Networks

Volume 104, 1 July 2020, 102178

OCRNN: An orthogonal constrained recurrent neural network for sleep analysis based on EEG data

https://doi.org/10.1016/j.adhoc.2020.102178

Abstract

This paper introduces an end-to-end mixed deep learning model for automatic sleep analysis based on the EEG signal. Unlike some existing machine learning models for EEG analysis, it does not rely on hand-crafted feature engineering or elaborate pipeline design. Furthermore, unlike existing deep learning frameworks built from off-the-shelf modules, we introduce the orthogonal constrained recurrent neural network (OCRNN) as the downstream module that follows the spatial-temporal expansion and representation provided by one-dimensional convolutional neural networks. We evaluate our model on EEG-based sleep datasets for sleep stage scoring and compare the performance of four types of RNN frameworks, three of which are OCRNNs. The results show that OCRNN achieves competitive or better F1 score, accuracy and AUC score than the previous baseline results. Moreover, our models (orRNN and pdRNN) achieve these results with fewer parameters and fewer training epochs, which demonstrates their potential for approximately real-time medical diagnosis.

Introduction

The electroencephalogram (EEG) is a complex signal that contains information from different frequency bands and different channels (multiple probes for different functional areas of the brain). This multi-modal signal has attracted researchers to look for ways to uncover the underlying information so that it can be leveraged in areas such as epilepsy, sleep, brain-computer interfacing, and cognitive monitoring [1]. Classification based on EEG signals is challenging because EEG is non-stationary and has a low signal-to-noise ratio (SNR), so any single hand-crafted signal processing approach runs into a bottleneck [2]. EEG has high temporal resolution owing to the high-speed propagation of the electric field, but low spatial resolution, so the signals from different channels are highly correlated. This makes it harder to decompose the signal and expand its details.

Thanks to modern computing power, which has unlocked the potential of data science and artificial intelligence, researchers seek to handle these problems with more powerful and automatic toolboxes for understanding the EEG signal. One of the most widely adopted off-the-shelf strategies is as follows: component analysis such as PCA, ICA and neighborhood component analysis [3], [4], [5] serves as the pre-processing step for downstream methods such as SVM and k-means [6], [7], [8]. However, this kind of framework belongs to the traditional hand-crafted machine learning algorithms, which rely heavily on domain knowledge and may be limited in applicability and flexibility. Deep learning, as one category of the machine learning family, provides hierarchical representations of the input data through complicated non-linear activations and flexible connections. Given the proven performance of deep learning in areas such as images, videos, acoustic signals and natural language processing, interest in deploying deep learning models for EEG understanding has risen as well.

Among the deep learning models, the recurrent neural network (RNN), as a model naturally suited to time series, often serves as the first-try model for EEG-related tasks. For instance, a special kind of RNN named the echo state network was used to classify REM Behavior Disorder (RBD) from 118 subjects (including healthy controls) based on the collected EEG signals [9]. A more complex multi-task RNN framework was proposed for motion intention recognition based on features extracted from segregated EEG signals [10]. An end-to-end RNN learning framework was designed to learn from the EEG signal after artifact removal, targeting movement-related cortical potentials (MRCP), in order to recognize lane change decision making [11]. Further system-level applications of RNNs demonstrate their suitability for broader uses such as autonomous whole-arm exoskeleton control [12] and decoding of motor imagery movements [13].

However, according to the statistics by Roy et al. [14], around 40% of the studies used convolutional neural networks (CNNs), while 13% used recurrent neural networks (RNNs). Previous studies mostly trained their neural networks on preprocessed EEG signals. The benefit of a mixed model is that it can leverage the advantages of each submodule of the neural network while minimizing their incompatibility. Some previous studies have demonstrated mixed models for raw time-series data in other applications such as soil moisture retrieval [15], [16] and signal detection [17], [18], in particular models that mix different neural networks, such as the CNN-RNN mixed framework [19]. Among EEG-related mixed models, most focus on an end-to-end downstream pipeline preceded by hand-crafted pre-processing techniques such as the short-time Fourier transform (STFT), mel-frequency cepstral coefficients (MFCC) and the low-pass filter (LPF). These models remain constrained by the physical limitations of the preprocessing steps, so we seek a more automatic and flexible framework. A one-dimensional CNN plus bidirectional long short-term memory (LSTM) model has been proposed as a fully mixed deep learning framework [20], but we have found few other examples of this kind.

In this paper, we propose the 1dCNN-OCRNN model, a one-dimensional CNN followed by an orthogonal constrained RNN, for the EEG-based sleep stage scoring task. The highlights of our work are:

  • We use the 1dCNN to generate multi-channel feature maps that play a role similar to a time-frequency transform, but we reduce the complexity of the model by adjusting the output and eliminating repeated convolution layers with fewer channels.

  • We design our own orthogonal-constraint-based RNNs together with a coordinate-descent-like iterative algorithm for training. This type of RNN can mitigate the long-standing vanishing and exploding gradient problems by leveraging the self-adjusted weights. As a result, important information from the EEG multi-channel feature maps is stored in the "memory" of the RNN.

  • We demonstrate the results using a real-world sleep stage classification dataset based on EEG signals. We provide a solid comparison of the performance of the three OCRNNs and the standard LSTM. Compared to the earlier results reported in [20], we achieve competitive, slightly better results using approximately 1/6 of the total number of parameters and 1/3 of the training epochs. Therefore, by saving computation cost in both space and time, our model has the potential to be broadly deployed in real scenarios.
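
The norm-preserving property behind the orthogonal constraint can be illustrated with a small sketch. It is not the OCRNN or its training algorithm, only a linear recurrence h <- scale * W h with no inputs and no activation, written in plain Python. The helper names (`givens_orthogonal`, `norm_after_steps`), the matrix size and the scale factors are assumptions chosen for illustration.

```python
import math
import random


def givens_orthogonal(n, num_rotations, seed=0):
    """Build an n x n orthogonal matrix as a product of random Givens rotations."""
    rng = random.Random(seed)
    q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(num_rotations):
        i, j = rng.sample(range(n), 2)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        # Left-multiply q by a plane rotation acting on rows i and j.
        for k in range(n):
            qi, qj = q[i][k], q[j][k]
            q[i][k] = c * qi - s * qj
            q[j][k] = s * qi + c * qj
    return q


def matvec(w, h):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w_rk * h_k for w_rk, h_k in zip(row, h)) for row in w]


def norm(h):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in h))


def norm_after_steps(w, h0, steps, scale=1.0):
    """Apply h <- scale * W h repeatedly and return the final Euclidean norm."""
    h = list(h0)
    for _ in range(steps):
        h = [scale * x for x in matvec(w, h)]
    return norm(h)


W = givens_orthogonal(n=8, num_rotations=40)  # hypothetical recurrent weight matrix
h0 = [1.0] * 8                                # hypothetical initial hidden state
steps = 100

preserved = norm_after_steps(W, h0, steps)            # orthogonal W: norm is kept
vanished = norm_after_steps(W, h0, steps, scale=0.9)  # slightly contractive: norm decays
exploded = norm_after_steps(W, h0, steps, scale=1.1)  # slightly expansive: norm grows
print(preserved, vanished, exploded)
```

With an orthogonal W the hidden-state norm stays at its initial value after 100 steps, while scaling W slightly below or above 1 makes the norm shrink or grow geometrically. This mirrors, in a simplified linear setting, the vanishing and exploding behaviour that the orthogonal constraint is meant to avoid.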

The rest of the paper is organized as follows. Section 2 introduces the preliminaries of the OCRNNs and their iterative training algorithms. Section 3 introduces the 1dCNN-OCRNN model and compares different combinations of the mixture model. Section 4 presents the EEG sleep stage analysis experiment and analyzes and discusses the performance comparison. Finally, Section 5 concludes the paper.

Preliminaries of the OCRNN

RNNs are broadly used to solve time-series or text-based problems such as acoustic signal prediction and processing, machine translation, music generation and natural language processing. Classical RNN modules rely on a gated-control mechanism, which uses gate modules to choose which information to keep and which to drop in order to avoid the long-standing vanishing and exploding gradient problems. This kind of framework still stands as the baseline and first-try framework when deploying the deep

Frameworks of the 1dCNN-OCRNN mixture model

Inspired by the work of Wang et al. [19] and Supratak et al. [26], we use the 1dCNN to obtain the time-frequency feature map. Unlike traditional signal-processing-based methods such as the short-time Fourier transform (STFT), mel-frequency cepstral coefficients (MFCC) and the low-pass filter (LPF), we do not need to calculate or separate specific frequency components by hand, but rely on the 1dCNN to automatically learn and iterate the multi-channel feature maps in order to achieve the
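
As a rough illustration of how a one-dimensional convolution turns a single-channel signal into multi-channel feature maps, consider the following sketch in plain Python. The toy signal, the two hand-picked kernels, the stride and the helper `conv1d` are assumptions for illustration; in an actual 1dCNN the kernels would be learned from data, and this is not the architecture used in the paper.

```python
import math


def conv1d(signal, kernels, stride=1):
    """Valid 1-D cross-correlation of one input channel with several kernels.

    Returns one ReLU-activated feature map (list of floats) per kernel.
    """
    feature_maps = []
    for kernel in kernels:
        width = len(kernel)
        fmap = []
        for start in range(0, len(signal) - width + 1, stride):
            window = signal[start:start + width]
            activation = sum(w * x for w, x in zip(kernel, window))
            fmap.append(max(0.0, activation))  # ReLU non-linearity
        feature_maps.append(fmap)
    return feature_maps


# Hypothetical toy input: a slow 2 Hz sine plus a weaker 20 Hz sine,
# sampled at 100 Hz for one second (100 samples).
fs = 100
signal = [
    math.sin(2 * math.pi * 2 * t / fs) + 0.5 * math.sin(2 * math.pi * 20 * t / fs)
    for t in range(fs)
]

# Hand-picked kernels standing in for learned filters:
# a moving average (smooths fast oscillations) and a
# second-difference-like kernel (responds to fast changes).
kernels = [
    [0.2, 0.2, 0.2, 0.2, 0.2],
    [-0.5, 0.0, 1.0, 0.0, -0.5],
]

maps = conv1d(signal, kernels, stride=2)
shapes = [len(m) for m in maps]
print(shapes)
```

Each kernel yields its own output channel, so the stacked maps give a multi-channel sequence over time that a downstream recurrent module can consume step by step.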

Experiment and performance analysis

To validate the effectiveness of the decomposition and iteration algorithm of OCRNN, as well as the effectiveness of the mixed model, we choose the sleep-EDF dataset [20], [26], which is essentially a transformed subset of the original sleep-EDF dataset [27], [28]. The original sleep-EDF database contains 197 whole-night polysomnographic sleep recordings with EEG, EOG, chin EMG, and event markers. The version we use here comes from the link: //www.kaggle.com/phhasian0710/deepsleepnet-201708

Conclusion

In this paper, we proposed a 1dCNN-OCRNN mixed framework to solve the sleep stage scoring problem based on EEG data. The 1dCNN is adopted to automatically extract the time-frequency feature map. It provides two levels of information (fine and coarse) in the feature map, which are merged as the input to the OCRNN. As for the OCRNN, we leverage the benefits of orthogonal constraints on the recurrent weight matrix and design the related coordinate-descent-like algorithms. We demonstrate

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (29)

  • M.-A. Moinnereau et al., EEG artifact removal for improved automated lane change detection while driving, 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (2018)

  • S. Crea et al., Feasibility and safety of shared EEG/EOG and vision-guided autonomous whole-arm exoskeleton control to perform activities of daily living, Sci. Rep. (2018)

  • Z. Tayeb et al., Validating deep neural networks for online decoding of motor imagery movements from EEG signals, Sensors (2019)

  • Y. Roy et al., Deep learning-based electroencephalography analysis: a systematic review, J. Neural Eng. (2019)

Fangqi Zhu received the B.Sc and M.Sc degrees in electrical engineering from the University of Electronic Science and Technology of China, Chengdu, in 2013 and 2016, respectively. He is currently working towards the Ph.D. degree in Electrical Engineering at the University of Texas at Arlington. His current research interests include structured deep learning, statistical inference and optimization, generative anomaly detection, signal processing and computational intelligence. He is a reviewer for several conferences and journals.

Qilian Liang (M’01-SM’05-F’16) is a distinguished university professor at the Department of Electrical Engineering, University of Texas at Arlington (UTA). He received his PhD degree in Electrical Engineering from the University of Southern California in 2000. Prior to joining the faculty of UTA in 2002, he was a Member of Technical Staff at Hughes Network Systems Inc in San Diego, California. His research interests include wireless sensor networks, wireless communications, signal processing and computational intelligence. Dr. Liang has published more than 300 journal and conference papers. He received the 2002 IEEE Transactions on Fuzzy Systems Outstanding Paper Award, the 2003 U.S. Office of Naval Research Young Investigator Award, the 2007, 2009 and 2010 U.S. Air Force Summer Faculty Fellowship Program Awards, the 2012 UTA College of Engineering Excellence in Research Award and the 2013 UTA Outstanding Research Achievement Award, and was inducted into the UTA Academy of Distinguished Scholars in 2015. Dr. Liang is a Fellow of the IEEE.
