Position Paper
Scalogram based prediction model for respiratory disorders using optimized convolutional neural networks
Introduction
Lung disease is the third largest cause of death in the world. According to a World Health Organization (WHO) report, more than 3 million people have died from chronic obstructive pulmonary disease (COPD) and lower respiratory infections, while tracheal, bronchus, and lung cancers account for 1.7 million deaths [1]. The attributes of respiratory sounds and their analysis play a significant role in diagnosing pulmonary disorders. Clinicians use several methods, such as spirometry, plethysmography, and arterial blood gas analysis, to diagnose these disorders. However, most of the available techniques are not always suitable [2]. Listening to lung sounds is an important part of the lung examination and helps in analyzing different respiratory disorders. Auscultation of the lungs is non-invasive, requiring no skin incision; it is more economical [3] and safer [4], and it is among the earliest diagnostic techniques used by specialists to analyze diverse pulmonary diseases [5].
Lung sounds are highly non-stationary and non-periodic in nature because of the turbulent flow of air and the variation in its volume. They are primarily classified into two groups: normal (vesicular) and abnormal (adventitious) sounds. Vesicular breath sounds are heard when no respiratory disorder is present. Abnormal sounds are additional sounds superimposed on vesicular sounds, and they usually indicate complications in the lungs or airways. Abnormal breath sounds include low-pitched wheezes called rhonchi, high-pitched crackles, high-pitched wheezes (caused by narrowing of the bronchial tubes), and stridor, a harsh sound caused by narrowing of the upper airway.
According to the American Thoracic Society [6], wheezes occur at frequencies above 400 Hz, whereas rhonchi occur at about 200 Hz. Wheezes can be classified as monophonic (a single frequency) or polyphonic (multiple frequencies). Wheezes can be either high or low pitched, and disorders associated with wheezing include pneumonia, asthma, and bronchitis. Crackles (coarse and fine), on the other hand, are irregular abnormal sounds produced by the sudden opening of collapsed small airways and are observed in patients with diseases such as pneumonia, fibrosis, and heart failure.
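The frequency ranges quoted above can be illustrated with a minimal sketch that estimates the dominant frequency of a sound segment and maps it to a coarse label. This is only an illustration, not the classification method proposed in this paper: the function names, the 4000 Hz sampling rate, the 50 Hz tolerance around 200 Hz, and the synthetic sine tones are all assumptions made for the example.

```python
import numpy as np

WHEEZE_MIN_HZ = 400.0  # wheeze threshold quoted in the text
RHONCHUS_HZ = 200.0    # approximate rhonchus frequency quoted in the text


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest peak in the magnitude spectrum."""
    windowed = signal * np.hanning(len(signal))  # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[0] = 0.0  # ignore the DC component
    return float(freqs[np.argmax(spectrum)])


def label_by_pitch(freq_hz, tolerance_hz=50.0):
    """Hypothetical rule of thumb mapping a dominant frequency to a label."""
    if freq_hz > WHEEZE_MIN_HZ:
        return "wheeze-like"
    if abs(freq_hz - RHONCHUS_HZ) <= tolerance_hz:  # tolerance is an assumption
        return "rhonchus-like"
    return "unlabelled"


sample_rate = 4000  # Hz, assumed for this example
t = np.arange(sample_rate) / sample_rate  # one second of samples
high_tone = np.sin(2 * np.pi * 600 * t)   # synthetic stand-in for a wheeze
low_tone = np.sin(2 * np.pi * 200 * t)    # synthetic stand-in for a rhonchus

print(label_by_pitch(dominant_frequency(high_tone, sample_rate)))
print(label_by_pitch(dominant_frequency(low_tone, sample_rate)))
```

Real lung sounds are non-stationary, so a single spectral peak over the whole segment is a crude summary; this is one motivation for the time-frequency representations discussed later.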
Data collected from real-world environments are mostly non-linear in nature, so traditional methods cannot be used to build analytical models. In the past decade, intelligent systems were applied to this problem but yielded high error rates [7]. With deep learning algorithms, the error rates can eventually become negligible because these algorithms can handle enormous amounts of unstructured data [8,9]. Several studies in this field aim to produce better representations and build models that learn from unlabeled data [10]. Deep learning is a framework for training neural networks; a network is considered deep if the input data is passed through a sequence of nonlinear transformations using various model architectures [11].
With deep learning models, features can be learned automatically and classified by feeding raw data directly into a deep neural network. The major deep learning architectures are unsupervised pre-trained networks (UPNs), convolutional neural networks (CNNs), and recurrent and recursive neural networks, which have been applied in areas including audio and speech signal processing and natural language processing. Among these, the CNN is a well-known and commonly used architecture for classifying images. In this paper, a scalogram-based, optimized, pre-trained AlexNet CNN model is developed to extract visual details from the pixel values of lung sound images and to classify and detect respiratory disorders accurately.
The remainder of this paper is organized as follows: related work on the classification of respiratory sounds using different approaches is reviewed in Section II; the proposed prediction model is described in Section III; the database description and numerical results are presented in Section IV; and conclusions are given in Section V.
Section snippets
Related work
In the literature, several efforts to classify normal and abnormal lung sounds have been reported. They can be broadly categorized by the time-frequency transform, the feature set, and the classification method used. Some of the efforts on classification algorithms using extracted features are as follows. In [12], lung sounds were classified using a Multi-Layer Perceptron (MLP) with Fourier transform-based feature extraction; however, only wheezes were identified.
Proposed methodology
The framework of the proposed prediction model is shown in Fig. 1. The proposed approach decomposes the segmented respiratory sound into several intrinsic mode functions (IMFs) using empirical mode decomposition (EMD) and transforms them into Bump and Morse scalograms. The percentage energy of each wavelet coefficient computed from the extracted IMFs forms the scalogram, which is input to the pre-trained optimized convolutional neural network for training and testing. Stochastic gradient
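The percentage-energy scalogram mentioned in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: an analytic Morlet wavelet is swapped in for the Bump and Morse wavelets, the EMD step is omitted (the transform is applied directly to a synthetic tone standing in for one IMF), and the function names, sampling rate, and frequency grid are hypothetical.

```python
import numpy as np


def morlet_cwt(signal, sample_rate, freqs_hz, omega0=6.0):
    """FFT-based continuous wavelet transform with an analytic Morlet wavelet.

    The Morlet wavelet is a stand-in; the paper uses Bump and Morse wavelets.
    Returns a complex array of shape (len(freqs_hz), len(signal)).
    """
    n = len(signal)
    omega = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / sample_rate)  # rad/s
    signal_hat = np.fft.fft(signal)
    coeffs = np.empty((len(freqs_hz), n), dtype=complex)
    for i, f in enumerate(freqs_hz):
        scale = omega0 / (2 * np.pi * f)  # scale whose centre frequency is f
        # Gaussian band-pass response, kept only for positive frequencies
        wavelet_hat = np.where(
            omega > 0, np.exp(-0.5 * (scale * omega - omega0) ** 2), 0.0
        )
        coeffs[i] = np.fft.ifft(signal_hat * wavelet_hat)
    return coeffs


def percentage_energy(coeffs):
    """Energy of each coefficient as a percentage of the total energy."""
    energy = np.abs(coeffs) ** 2
    return 100.0 * energy / energy.sum()


sample_rate = 4000                        # Hz, assumed
t = np.arange(2000) / sample_rate         # 0.5 s segment
segment = np.sin(2 * np.pi * 300 * t)     # synthetic stand-in for one IMF
freqs_hz = np.linspace(100, 1000, 19)     # analysis frequencies, 50 Hz apart

scalogram = percentage_energy(morlet_cwt(segment, sample_rate, freqs_hz))
peak_row = int(np.argmax(scalogram.sum(axis=1)))
print(scalogram.shape, freqs_hz[peak_row])
```

In a pipeline like the one described, such a matrix would typically be rendered as an image and resized to the network's input size before training; that step is not shown here.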
Results and discussion
The scalograms obtained from the audio files through the continuous wavelet transform and EMD were used to train the pre-trained AlexNet for 2, 4, 8, 16, and 20 epochs. Two optimization methods, Stochastic Gradient Descent with Momentum (SGDM) and Adaptive Moment Estimation (Adam), were used for training. The experimental settings for modeling the network and the AlexNet architecture are listed in Tables 1 and 2.
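For readers unfamiliar with the two optimizers, their standard update rules can be sketched on a toy quadratic loss. This is an illustration only: the learning rate, momentum, and decay constants below are assumed textbook defaults, not the settings of Tables 1 and 2, and the loss function is a hypothetical stand-in for the network's training loss.

```python
import numpy as np


def sgdm_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One stochastic gradient descent with momentum update."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity


def adam_step(w, grad, m, v, step, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first and second moment estimates."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v


def loss(w):
    """Toy quadratic loss with its minimum at the origin (assumption)."""
    return 0.5 * float(np.sum(w ** 2))


def grad(w):
    """Gradient of the toy quadratic loss."""
    return w


start = np.array([3.0, -2.0])
initial_loss = loss(start)

w_sgdm, velocity = start.copy(), np.zeros_like(start)
w_adam, m, v = start.copy(), np.zeros_like(start), np.zeros_like(start)
for step in range(1, 201):  # Adam's bias correction needs a 1-based step count
    w_sgdm, velocity = sgdm_step(w_sgdm, grad(w_sgdm), velocity)
    w_adam, m, v = adam_step(w_adam, grad(w_adam), m, v, step)

sgdm_loss, adam_loss = loss(w_sgdm), loss(w_adam)
print(initial_loss, sgdm_loss, adam_loss)
```

SGDM applies one global step size to a running velocity, while Adam rescales each parameter's step by an estimate of its gradient magnitude, which is why the two can behave differently over the same number of epochs.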
The training loss and test accuracy curves versus
Conclusion
In this paper, a prediction model based on a pre-trained AlexNet convolutional neural network is proposed, using Bump and Morse scalograms created from IMFs extracted by EMD. EMD is chosen as the preferred domain in this work because, by virtue of its adaptive nature, this decomposition technique treats all signal components in an unbiased manner, irrespective of the pattern of the basis function. Experimental results show that scalograms derived from IMFs of EMD, when given as input to
Declaration of Competing Interest
None.
References (31)
- et al. Neural classification of lung sounds using wavelet coefficients. Comput Biol Med (2004)
- et al. An integrated automated system for crackles extraction and classification. Biomed Signal Process Control (2008)
- et al. Pulmonary crackle detection using time–frequency and time–scale analysis. Digit Signal Process (2013)
- et al. Assessment of time–frequency representation techniques for thoracic sounds analysis. Comput Methods Programs Biomed (2014)
- et al. Lung sound classification using cepstral-based statistical features. Comput Biol Med (2016)
- et al. Lung sounds classification using convolutional networks. Artif Intell Med (2018)
- ...
- et al. Fundamentals of lung auscultation. N Engl J Med (2014)
- et al. The relationship between normal lung sounds, age and gender. Am J Respir Crit Care Med (2000)
- et al. Auscultation of the respiratory system. Ann Thorac Med (2015)
- Wheezes. Eur Respir J
- Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell
- The history began from AlexNet: a comprehensive survey on deep learning approaches. Computer Vision and Pattern Recognition
- An approach to develop expert systems in medical diagnosis using machine learning algorithms (asthma) and a performance study. International Journal on Soft Computing (IJSC)