Towards heart sound classification without segmentation via autocorrelation feature and diffusion maps

https://doi.org/10.1016/j.future.2016.01.010

Highlights

  • A novel framework for heart sound classification without segmentation.

  • Extracting the autocorrelation features of the normalized average Shannon energy envelopes at different wavelet sub-bands.

  • Fusing the autocorrelation features into a unified feature using diffusion maps and classifying it with an SVM classifier.

  • Evaluating the proposed method on two public datasets published in the PASCAL Classifying Heart Sounds Challenge.

Abstract

Heart sound classification, used for automatic heart sound auscultation and cardiac monitoring, plays an important role in primary health centers and home care. However, one of the most difficult problems in heart sound classification is heart sound segmentation, especially when classifying the wide range of real-world heart sounds accompanied by murmurs and other artifact noise. In this study, we present a novel framework for heart sound classification without segmentation, based on autocorrelation features and diffusion maps, which can provide a primary diagnosis in primary health centers and home care. In the proposed framework, the autocorrelation features are first extracted from the sub-band envelopes calculated from the sub-band coefficients of the heart sound signal obtained with the discrete wavelet transform (DWT). Then, the autocorrelation features are fused into a unified feature representation with diffusion maps. Finally, the unified feature is input into a Support Vector Machine (SVM) classifier to perform the heart sound classification. The proposed framework is evaluated on two public datasets published in the PASCAL Classifying Heart Sounds Challenge. The experimental results show that the proposed method outperforms the baselines.

Introduction

Heart sound auscultation has been a critical part of the clinical examination since the invention of the stethoscope by Laennec in 1816 [1]. Traditional heart auscultation, however, depends heavily on the ear sensitivity and the subjective experience (auscultation skill) of the physician [2]. Nowadays, heart sound classification, used for automatic heart sound auscultation [3] and cardiac monitoring [4], has become a promising research field building on the methods and techniques of modern signal processing and artificial intelligence [5]. With the development and popularization of the electronic stethoscope and the smartphone (e.g., the iPhone), heart sound classification plays an important role in primary health centers and home care.

The procedure of heart sound classification usually consists of three steps: heart sound segmentation, feature extraction, and classification. Heart sound segmentation aims at dividing the heart sound signal into a series of cardiac cycles. From each cardiac cycle, a feature is extracted that captures information about the mechanical activity of the heart within one cardiac period. The extracted feature is input into a classifier, such as Artificial Neural Networks (ANN), Support Vector Machines (SVM), or Hidden Markov Models (HMMs), to identify abnormal heart sounds, which usually relate to some heart condition. In some methods [4], [6], [7], heart sound segmentation was performed with the electrocardiogram (ECG), recorded in parallel, as a reference. By segmenting the heart sound signal into cardiac cycles according to the ECG, Ahlstrom [6] focused on murmur classification (distinguishing pathological murmurs from physiological murmurs) based on a recurrence quantification analysis (RQA) feature and an ANN classifier. Jabbari [7] also relied on the ECG to segment and classify heart sounds, using a feature extracted by matching pursuit and a three-layer feed-forward multilayer perceptron (MLP) network. However, these methods with ECG-based segmentation require the heart sound and the ECG signal to be recorded simultaneously and processed synchronously, which is very inconvenient, especially in the case of infants or newborn children [8].

Recently, more classification methods that do not use the ECG signal have been proposed. Among them, segmentation with envelope analysis is the most popular and is widely used for feature extraction and heart sound classification. Envelope-based segmentation is performed in three steps: (1) extracting the envelope of the heart sound signal; (2) detecting the peaks of the fundamental heart sounds (FHS), S1 and S2; and (3) identifying the cardiac cycles with peak conditioning. The envelope extraction algorithms used in envelope-based classification methods include the normalized average Shannon energy [9], the Hilbert transform [10], homomorphic filtering [11], cardiac sound characteristic waveform extraction [2], the Hilbert–Huang transform [12], and the short-time modified Hilbert transform [13]. Moreover, to improve the robustness of FHS peak detection, the heart sound signal is usually represented in a transform domain using signal analysis approaches such as the short-time Fourier transform, the discrete wavelet transform [14], the tunable-Q wavelet transform [15], optimum multi-scale wavelet packet decomposition (OMS-WPD) [16], and the S-transform [17]. However, due to the unreliability of peak conditioning for detecting and identifying the FHS peaks, envelope-based methods suffer from two main drawbacks [18]. First, true FHS peaks may be missed and extra false peaks detected because of interference from murmurs or background noise. Second, the common assumption used in peak conditioning, that the systolic period is shorter than the diastolic period, does not always hold, especially for infants, newborn children, or some cardiac patients.
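To make the envelope-extraction step concrete, the following is a minimal sketch of the normalized average Shannon energy envelope mentioned above. The frame length and hop (in samples) are illustrative choices, not the paper's settings:

```python
import numpy as np

def shannon_energy_envelope(x, frame_len=40, hop=20):
    """Normalized average Shannon energy envelope of a heart sound signal.

    frame_len/hop are in samples; at a 2 kHz sampling rate these are
    roughly the ~20 ms frames with 50% overlap common in the literature
    (assumed values, not taken from the paper).
    """
    x = x / np.max(np.abs(x))                 # amplitude normalization
    eps = np.finfo(float).eps                 # avoid log(0)
    frames = [x[i:i + frame_len]
              for i in range(0, len(x) - frame_len + 1, hop)]
    # Average Shannon energy per frame: -(1/N) * sum(x^2 * log(x^2))
    E = np.array([-np.mean(f**2 * np.log(f**2 + eps)) for f in frames])
    # Normalize to zero mean and unit variance.
    return (E - E.mean()) / E.std()
```

The Shannon energy emphasizes medium-intensity components, which makes the S1/S2 lobes stand out against both low-level noise and short high-amplitude spikes.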
Besides these envelope-based methods, approaches based on statistical models have also been used for heart sound segmentation in a supervised or unsupervised way, such as HMMs [19], duration-dependent HMMs [20], ensemble empirical mode decomposition [21], K-means [22], and dynamic clustering [23]. The nature of the model-based methods is to characterize or summarize the properties of the FHS with their models, based on discriminative information about the FHS such as the distribution of time–frequency energy, the period (of systole, diastole, or the cardiac cycle), and the temporal correlation (Markov property). Unfortunately, the properties of the FHS vary greatly from infants to the elderly and from healthy people to cardiac patients. It is difficult for model-based methods to capture all FHS in a unified model, especially when the recordings are accompanied by artifact sounds in the real world.

In fact, the primary task of heart sound classification can be performed without heart sound segmentation, as done in  [18]. The goal of the primary task is only to detect the presence of a disorder in the heart sound rather than to further identify it, which is sufficient to provide a primary diagnosis in primary health centers and home care. The results of the primary diagnosis can, of course, also be used later to implement automatic diagnosis. Based on the cardiac period estimated from the heart sound, Yuenyong  [18] extracted an equal number of cardiac cycles and classified them into two categories (normal and abnormal) with a neural network classifier. However, accurate estimation of the cardiac period is itself difficult, and their method does not consider the classification of heart sounds mixed with artifact sounds.

In this study, a novel framework is proposed for the primary task of heart sound classification based on diffusion maps [24], [25] and an SVM classifier. The overall framework is shown in Fig. 1. First, the pre-processed heart sound signal is decomposed into approximation and detail coefficients using the discrete wavelet transform (DWT). Second, from these coefficients, the normalized average Shannon energy envelopes and their autocorrelation functions are calculated. Third, the sub-band autocorrelation functions associated with the approximation and detail DWT coefficients are fused by diffusion maps to obtain a unified feature representation of the heart sound signal. Finally, the feature is input into the SVM to classify the heart sound signal. In addition, experiments are performed on the public datasets published in the PASCAL Classifying Heart Sounds Challenge [26], and the proposed method is compared with the three best algorithms presented in the Challenge [26]: UCI, J48, and MLP.
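The autocorrelation step above captures the periodicity of the cardiac cycle without locating individual S1/S2 peaks. A minimal sketch of a normalized sub-band autocorrelation feature (the lag range is an assumed parameter, not the paper's setting):

```python
import numpy as np

def autocorrelation_feature(env, max_lag=None):
    """Normalized autocorrelation of a sub-band envelope.

    Returns r[0..max_lag] with r[0] = 1; the periodicity of the cardiac
    cycle shows up as peaks at lags equal to multiples of the cycle
    length, with no need to detect S1/S2 positions.
    """
    env = env - env.mean()                       # remove DC offset
    n = len(env)
    if max_lag is None:
        max_lag = n - 1
    full = np.correlate(env, env, mode="full")   # lags -(n-1)..(n-1)
    r = full[n - 1:n + max_lag]                  # keep non-negative lags
    return r / r[0]                              # normalize by zero-lag energy
```

Because the autocorrelation is invariant to time shifts of the envelope, the feature is the same regardless of where in the cardiac cycle the recording starts.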

The main contributions of this paper are twofold. (i) In contrast to existing approaches, the proposed framework is the first to perform heart sound classification without using any location information in the heart sound signal, such as segmentation. Although segmentation is not required in  [18], that method relies on accurate estimation of the cardiac period, which provides a reference for selecting intervals from the heart sound signal; estimation of the cardiac period is unnecessary in our method. (ii) We propose a novel approach to fusing the autocorrelation features of different frequency bands based on diffusion maps. Our experiments show that the proposed framework based on this feature fusion is robust for classifying heart sounds with artifact sounds, strong murmurs, and noise.
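For readers unfamiliar with the fusion step, the following is a generic sketch of a diffusion-map embedding, the dimensionality-reduction technique used for fusing the sub-band features. The kernel width, embedding dimension, and diffusion time are illustrative choices, not the paper's settings:

```python
import numpy as np

def diffusion_map(X, epsilon=1.0, n_components=2, t=1):
    """Basic diffusion-map embedding of the rows of X.

    Builds a Gaussian affinity matrix, row-normalizes it into a Markov
    transition matrix, and embeds each point with the leading non-trivial
    eigenvectors scaled by their eigenvalues.
    """
    # Gaussian affinity on pairwise squared Euclidean distances.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-d2 / epsilon)
    # Row-normalize to obtain the Markov transition matrix P.
    P = K / K.sum(axis=1, keepdims=True)
    # P is similar to a symmetric matrix, so its eigenvalues are real.
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # Diffusion coordinates lambda_k^t * psi_k, skipping the trivial
    # constant eigenvector at lambda_0 = 1.
    return vecs[:, 1:n_components + 1] * vals[1:n_components + 1] ** t
```

Points that are connected by many short diffusion paths end up close in the embedding, which is why the map provides a compact, noise-robust unified representation of the sub-band autocorrelation features.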

Section snippets

Pre-processing and envelope extraction

The heart sound signal x(i) is first decimated to a 2 kHz sampling frequency and then filtered with a sixth-order zero-phase Butterworth band-pass filter (25–900 Hz) to eliminate out-of-band noise. Next, the resulting signal x̂(i) is normalized as x̄(i) = x̂(i)/max(|x̂(i)|).
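This pre-processing chain can be sketched with SciPy as follows (a minimal illustration; it assumes the original sampling rate is an integer multiple of 2 kHz):

```python
import numpy as np
from scipy.signal import butter, decimate, sosfiltfilt

def preprocess(x, fs):
    """Decimate to 2 kHz, band-pass 25-900 Hz (zero-phase), normalize."""
    # Decimate to a 2 kHz sampling frequency (integer ratio assumed).
    factor = int(fs // 2000)
    if factor > 1:
        x = decimate(x, factor)
    fs = 2000
    # Sixth-order Butterworth band-pass; sosfiltfilt gives zero phase.
    sos = butter(6, [25, 900], btype="bandpass", fs=fs, output="sos")
    x = sosfiltfilt(sos, x)
    # Normalize to unit maximum absolute amplitude.
    return x / np.max(np.abs(x))
```

Zero-phase (forward-backward) filtering matters here: a causal filter would shift the S1/S2 lobes in time and distort the envelope shape.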

The decimated and normalized heart sound signal is decomposed into four levels using the order-six Daubechies (db6) wavelet, chosen for its morphological similarity to heart sound components  [27]. The approximation

Datasets

The proposed framework is applied to two public heart sound datasets published in the PASCAL Classifying Heart Sounds Challenge  [26]. The first, named Dataset-A, was collected from volunteer iPhone users and recorded with iStethoscope (an iPhone application) in real-world situations. No information is available on the auscultated subjects, such as gender, age, or condition  [28]. Dataset-A contains 176 records in WAV format with a 44 100 Hz

Conclusion

This study proposed a novel framework for classifying heart sounds without segmenting them into cardiac cycles. The proposed framework is effective in providing a primary diagnosis for automatic heart sound auscultation in the real world, before further identification of specific murmurs. Instead of characterizing features of cardiac cycles obtained by segmentation, the sub-band autocorrelation features can capture the whole information of the heart sound signal based on

Acknowledgments

This work was supported in part by the Major Research plan of the National Natural Science Foundation of China (No. 91120303), National Natural Science Foundation of China (No. 91220301), Natural Science Foundation of Heilongjiang Province of China (No. F2015012), Academic Core Funding of Young Projects of Harbin Normal University of China (No. KGB201225), and Open Fund by Smart Education and Information Engineering (Harbin Normal University) (No. EIE2013-01).


References (28)

  • A. Moukadem et al.

    A robust heart sounds segmentation module based on S-transform

    Biomed. Signal Process. Control

    (2013)
  • C.N. Gupta et al.

    Neural network classification of homomorphic segmented heart sounds

    Appl. Soft Comput.

    (2007)
  • H. Tang et al.

    Segmentation of heart sounds based on dynamic clustering

    Biomed. Signal Process. Control

    (2012)
  • A. Hamdy, H. Hefny, M.A. Salama, A.E. Hassanien, T.-H. Kim, The importance of handling multivariate attributes in the...

    Shi-Wen Deng received the B.E. degree from the Institute of Technology, Jia Mu Si University, JiaMuSi, China, in 1997, the M.E. from The School of Computer Science, Harbin Normal University, Harbin, China, in 2005, and the Ph.D. degree from The School of Computer Science, Harbin Institute of Technology in 2012. Currently, he is with the School of Mathematical Sciences, Harbin Normal University, Harbin, China. His research interests are in the area of speech and audio signal processing, including content-based audio analysis, noise suppression, speech/audio classification/detection.

    Ji-Qing Han received the B.S. and M.S. degrees in electrical engineering and the Ph.D. degree in computer science from the Harbin Institute of Technology, Harbin, China, in 1987, 1990, and 1998, respectively. Currently, he is the associate dean of the School of Computer Science and Technology, Harbin Institute of Technology. He is a member of IEEE and a member of the editorial boards of the Journal of Chinese Information Processing and the Journal of Data Acquisition & Processing. Prof. Han is undertaking several projects from the National Natural Science Foundation, the 863 Hi-tech Program, and the National Basic Research Program. He has won three Second Prize and two Third Prize ministry/province Science and Technology awards. He has published more than 100 papers and 2 books. His research fields include speech signal processing and audio information processing.
