Classification of snoring sound based on a recurrent neural network
Introduction
Lack of sleep induces chronic fatigue, lethargy, and daytime sleepiness, and it has a considerable impact on social activities. Snoring is one of the main causes of sleep disruption; it is a breathing disturbance that occurs when the upper airway closes during sleep or when breathing is obstructed by a restricted airway. Snoring is a sleep disorder that affects a large part of the population: over 60% of adult men and 44% of adult women over the age of 40 are known to snore (Dalmasso and Prota, 1996, Duckitt et al., 2006, Lugaresi et al., 1980). Moreover, snoring is not only a symptom associated with disrupted sleep; it may also give rise to serious diseases, such as sleep apnea or ischemic cerebral diseases (Cavusoglu et al., 2007, Dalmasso and Prota, 1996, Duckitt et al., 2006, Lugaresi et al., 1980, Norton and Dunn, 1985, Wilkin, 1985).
Polysomnography, which measures various biological variables with specialized equipment, is the most common way to evaluate the severity of a snoring disorder (Duckitt et al., 2006). However, polysomnography requires significant time and cost, and because subjects are measured while sleeping in an unfamiliar environment, it is difficult to capture the same signals that they exhibit on a daily basis. Further, it is difficult for the test administrator to manually examine snoring signals that are measured over a long period of time (Osborne et al., 1999, Osman et al., 1998). Therefore, various sleep-related applications are now being developed to measure snoring at home, and there is increased interest in snoring prevention pillows and other home care products (Bhat et al., 2015, Cistulli and Grunstein, 2005). Still, measuring snoring remains a difficult task. Electrocardiography (ECG), pulse, respiration, and other such biological signals are typically periodic with a characteristic frequency, aside from special cases. However, snoring is relatively irregular, and every individual snores at a different frequency. In addition, snoring sounds range widely from quiet sounds that resemble breathing to extremely loud sounds (> 95 dB) (Emoto, Abeyratne, & Kawano et al., 2018), and the length of snoring signals also differs for each person. Because these features vary from person to person, developing a generalized snoring detection algorithm is a considerable challenge.
Numerous studies have been conducted to classify and detect snoring. Among recent studies, Hwang et al. (2015) classified snoring with a support vector machine using sleep data recorded by a contactless sensor rather than audio. Ng, San Koh, Puvanendran, and Abeyratne (2008) classified snoring signals through a level-correlation-dependent threshold and a translation-invariant discrete wavelet transform. Cavusoglu et al. (2007) used various feature vectors of snoring signals and classified them by linear regression in a two-dimensional plane obtained by principal component analysis. Duckitt et al. (2006) extracted voice signal features through Mel-frequency cepstral coefficients (MFCC) and used a hidden Markov model to classify snoring signals. Snoring sounds were also classified by applying short-time energy and zero-crossing rate (ZCR) (Abeyratne et al., 2005, Cavusoglu et al., 2007, Fiz et al., 1996). Other studies have extracted additional features and separated snoring from respiratory sounds with an unsupervised learning algorithm (Azarbarzin & Moussavi, 2011).
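As an illustration of the short-time energy and ZCR features mentioned above, the following is a minimal sketch in plain Python. The function names, the 8 kHz sampling rate, and the 256-sample frame with a 128-sample hop are assumptions chosen for the example; they are not taken from the cited studies.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sequence of samples into overlapping frames (ragged tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical test tones: 1 s at an assumed 8 kHz sampling rate.
fs = 8000
low_tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]
high_tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]

# 256-sample frames (32 ms at 8 kHz) with 50% overlap; both values are assumed.
low_zcr = [zero_crossing_rate(f) for f in frame_signal(low_tone, 256, 128)]
high_zcr = [zero_crossing_rate(f) for f in frame_signal(high_tone, 256, 128)]
mean_low_zcr = sum(low_zcr) / len(low_zcr)
mean_high_zcr = sum(high_zcr) / len(high_zcr)
print(mean_low_zcr, mean_high_zcr)
```

The higher-frequency tone crosses zero more often per frame, which is why ZCR serves as a rough frequency indicator, while short-time energy tracks loudness.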
Since the features of snoring signals are different for every subject, it is very difficult to detect snoring episodes using a threshold. However, each individual's snoring episode lengths, periods, and frequencies are relatively consistent, so learning these features and using them in a customized algorithm should increase the efficiency of detecting snoring episodes.
Therefore, this study developed a deep learning-based classifier that automatically separates snoring signals from non-snoring signals in measured recordings. The ZCR, Short-Time Fourier Transform (STFT), and MFCC, which are frequently used in voice signal processing, were utilized to extract features from the measured signals. The extracted features were manually labeled (annotated), then used as the training set, validation set, and test set for the recurrent neural network (RNN). The performance of the proposed classifier was evaluated using statistical values (sensitivity, specificity, precision, and accuracy) and the F1-score.
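The evaluation measures named above follow directly from the counts of a binary confusion matrix. The sketch below uses their standard definitions and assumes that snoring is treated as the positive class; the counts in the example are invented for illustration and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: snoring episodes classified correctly/incorrectly (assumed positive class)
    tn/fp: non-snoring episodes classified correctly/incorrectly
    Denominators are assumed to be non-zero.
    """
    sensitivity = tp / (tp + fn)            # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The F1-score is the harmonic mean of precision and sensitivity, so it is informative when the two classes are imbalanced.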
Section snippets
The overall structure of the proposed method
The snoring classification system proposed in this study can be structured into feature extraction and classification, and the overall structure of this study is shown in Fig. 1. During the feature extraction stage, after signals recorded during sleep were divided into snoring episodes (SE) and non-snoring episodes (NSE), as shown in Fig. 1(a), the ZCR, STFT, and MFCC were used to extract features as shown in Fig. 1(b). In the next stage, as shown in Fig. 1(c), the features that were extracted
Dataset
Table 2 shows the configuration of the episodes used as the input vector of the RNN. The 8000 labeled episodes used as RNN input consisted of 5600 SE and 2400 NSE. As described above, an NSE is any sound other than snoring, and it consists of voice, deep breaths, music, silence, and others (TV, footsteps, running water, etc.). The voice and music samples were recorded deliberately by playing music or by reading a passage aloud.
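One way to divide such a dataset into training, validation, and test sets while preserving the 5600:2400 class ratio is a stratified split. The sketch below is only an illustration: the 70/15/15 percentages, the seed, and the function name are assumptions, since the split used in the study is not stated in this excerpt.

```python
import random

def stratified_split(labels, percents=(70, 15, 15), seed=0):
    """Return (train, val, test) index lists that keep class proportions.

    percents are assumed integer percentages; the last split takes the remainder.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = len(indices) * percents[0] // 100
        n_val = len(indices) * percents[1] // 100
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        test.extend(indices[n_train + n_val:])
    return train, val, test

# Class counts from the text: 5600 snoring episodes (SE), 2400 non-snoring (NSE).
labels = ["SE"] * 5600 + ["NSE"] * 2400
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting each class separately keeps the SE share at 70% in every subset, so no subset over- or under-represents snoring episodes.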
Feature extraction and representation
Feature extraction for SE and NSE took place
Conclusions
In this study, features that were extracted from SE and NSE were used to develop an automatic classifier based on an RNN with outstanding classification performance. ZCR, STFT, and MFCC, which are often used to extract features from voice signals, were used to extract frequency-related features from recorded snoring signals and non-snoring signals (voice, movement, door closing sound, and others), and these features were used as the RNN's input vector to improve the classifier's accuracy.
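To make the idea of feeding a sequence of per-frame feature vectors to an RNN concrete, the following sketch runs the forward pass of a simple Elman-style recurrent layer with a sigmoid output. It is not the architecture of this study, which is not described in this excerpt: the layer sizes, the random untrained weights, and all names are assumptions for illustration.

```python
import math
import random

def make_rnn(n_in, n_hidden, seed=0):
    """Create small random (untrained) parameters for a single recurrent layer."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_xh": matrix(n_hidden, n_in),      # input-to-hidden weights
        "W_hh": matrix(n_hidden, n_hidden),  # hidden-to-hidden (recurrent) weights
        "b_h": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)],
        "b_out": 0.0,
    }

def rnn_snore_probability(params, sequence):
    """Run the sequence through the layer and map the last hidden state to (0, 1)."""
    h = [0.0] * len(params["b_h"])
    for x in sequence:
        # New hidden state depends on the current frame and the previous state.
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_x, x))
                + sum(w * hj for w, hj in zip(w_h, h))
                + b
            )
            for w_x, w_h, b in zip(params["W_xh"], params["W_hh"], params["b_h"])
        ]
    logit = sum(w * hj for w, hj in zip(params["w_out"], h)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical episode: 20 frames, each with 13 feature values (assumed sizes).
rng = random.Random(1)
episode = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]
params = make_rnn(n_in=13, n_hidden=8)
probability = rnn_snore_probability(params, episode)
print(probability)
```

Because the weights are untrained, the output carries no meaning; in practice the parameters would be fitted on the labeled SE and NSE episodes, typically with a deep learning library rather than hand-written loops.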
Since
Declarations of interest
None.
Disclosures
All authors have approved the final article.
Acknowledgment
This work was supported by the Korea Institute of Industrial Technology [EO180016]. The funding source had no role in study design; in the collection, analysis and interpretation of data; in the writing of the report; or in the decision to submit the article for publication.
References (44)
- et al. Detection of sleep breathing sound based on artificial neural network analysis. Biomedical Signal Processing and Control (2018)
- et al. Recurrent neural network based prediction of epileptic seizures in intra- and extracranial EEG. Neurocomputing (2000)
- et al. Recurrent neural network-based approach for early recognition of Alzheimer's disease in EEG. Clinical Neurophysiology (2001)
- et al. Intracranial pressure model in intensive care unit using a simple recurrent neural network through time. Neurocomputing (2004)
- Recurrent neural networks with composite features for detection of electrocardiographic changes in partial epileptic patients. Computers in Biology and Medicine (2008)
- et al. Automatic breath and snore sounds classification from tracheal and ambient sounds recordings. Medical Engineering & Physics (2010)
- et al. Pitch jump probability measures for the analysis of snoring sounds in apnea. Physiological Measurement (2005)
- et al. Automatic and unsupervised snore sound extraction from respiratory sound signals. IEEE Transactions on Biomedical Engineering (2011)
- et al. Separation of voiced and unvoiced using zero crossing, rate and energy of the speech signal
- et al. Ganong's review of medical physiology (2015)
- Is there a clinical role for smartphone sleep apps? Comparison of sleep cycle detection by a smartphone application to polysomnography. Journal of Clinical Sleep Medicine
- An efficient method for snore/nonsnore classification of sleep sounds. Physiological Measurement
- Medical devices for the diagnosis and treatment of obstructive sleep apnea. Expert Review of Medical Devices
- Snoring: Analysis, measurement, clinical implications and applications. European Respiratory Journal
- Comparison of parametric representation for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing
- Automatic detection, segmentation and assessment of snoring from ambient acoustic data. Physiological Measurement
- Artificial neural networks for breathing and snoring episode detection in sleep sounds. Physiological Measurement
- Acoustic analysis of snoring sound in patients with simple snoring and obstructive sleep apnoea. European Respiratory Journal
- An expert system for detection of electrocardiographic changes in patients with partial epilepsy using wavelet based neural networks. Expert Systems
- Long short-term memory. Neural Computation
- Polyvinylidene fluoride sensor-based method for unconstrained snoring detection. Physiological Measurement
- Audio signal processing and recognition