1 Introduction

Over time, studies of the cardiac and respiratory systems have provided a large number of tools to diagnose disease and improve people's quality of life. These contributions have not only advanced the early detection of pathologies but have also prompted studies of new technologies in the clinical field and research towards a better understanding of cardiorespiratory system function [1,2,3].

Several studies show the relevance of posture to cardiac and respiratory dynamics [4,5,6], recognizing its impact on the diagnosis of some pathologies such as vertebral fracture [7, 8], on how blood pressure changes in resting conditions for hypertensive patients [9], or on the autonomic modulation of heart rate in sedentary young people [10]. However, to the best of the authors' knowledge, no studies have analyzed how features of the cardiac and respiratory systems change with posture, which may help to specify the best position for a clinical examination. The identification of posture, based on the analysis of cardiorespiratory dynamics, may also be of interest for clinical and non-clinical applications [11,12,13].

This document presents a statistical analysis of respiratory flow (FLW) and electrocardiographic (ECG) features of healthy subjects as a function of posture. Machine learning models are proposed to identify posture based only on selected ECG and FLW features.

2 Materials and Method

2.1 HealthyDB Database

ECG and respiratory flow signals of 44 healthy subjects ranging in age from 22 to 33 years old were recorded under standardized resting conditions (quiet environment, same place) using BIOPAC System Inc. MP150 equipment. All records were made considering two positions: supine (for 30 min) and sitting (for 15 min). Table 1 shows demographic information of the subjects analyzed.

Table 1. Mean ± standard deviation of the physical data of the subjects grouped by gender.

All signals were recorded simultaneously, first in supine position and then in sitting position, with a five-minute pause between the two records. For each of the 44 healthy subjects and each position (supine or sitting), five signals were obtained: four from the ECG (monopolar leads I, II, III and a chest precordial lead), sampled at 250 Hz, and one FLW signal, sampled at 10 Hz. Figure 1 presents an excerpt of the ECG and FLW signals of a subject in supine position.

Fig. 1. Excerpt of ECG (leads I, II, III and chest) and FLW signals of a supine subject.

Records were preprocessed to detect and correct artifacts and outliers. Custom algorithms were applied to detect the events of the signals. Wrong detections were manually corrected whenever necessary.

2.2 Signal Processing

For each subject and for each signal, we extracted time and frequency domain parameters to describe cardiac and respiratory activity. For the time domain, statistical and non-linear features were extracted. In addition, machine learning models were used to distinguish the sitting from the supine position using the cardiac and respiratory features. Figure 2 shows a schematic representation of the process.

Fig. 2. Overview of the methodology used in this work.

ECG signals were pre-processed with a high-pass filter with a cutoff frequency of 0.2 Hz and a low-pass filter with a cutoff frequency of 40 Hz, to remove possible artifacts.
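The filtering step can be illustrated with a minimal Python sketch. The source gives only the two cutoff frequencies, so the Butterworth design, the filter order and the zero-phase (forward-backward) filtering are assumptions, and `bandlimit_ecg` is a hypothetical helper name.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS_ECG = 250.0  # Hz, ECG sampling frequency reported in Sect. 2.1


def bandlimit_ecg(ecg, fs=FS_ECG, f_high_pass=0.2, f_low_pass=40.0, order=4):
    """High-pass at 0.2 Hz, then low-pass at 40 Hz, both zero-phase."""
    # Assumption: Butterworth filters of this order; the source does not
    # state the filter family or order.
    sos_hp = butter(order, f_high_pass, btype="highpass", fs=fs, output="sos")
    sos_lp = butter(order, f_low_pass, btype="lowpass", fs=fs, output="sos")
    x = sosfiltfilt(sos_hp, np.asarray(ecg, dtype=float))
    return sosfiltfilt(sos_lp, x)


# Synthetic check signal: 10 Hz in-band tone, constant baseline offset,
# and 60 Hz interference (all values are illustrative).
t = np.arange(0, 30, 1 / FS_ECG)
tone = np.sin(2 * np.pi * 10 * t)
raw = tone + 2.0 + 0.5 * np.sin(2 * np.pi * 60 * t)
clean = bandlimit_ecg(raw)
```

Away from the edges, `clean` should follow the 10 Hz tone closely while the offset and the 60 Hz component are strongly attenuated.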

2.3 Temporal Features

In the time domain, the following features were extracted to characterize cardiac and respiratory dynamics: the RR interval (distance between two consecutive R peaks) and the amplitude of the R peaks from the ECG signals; and the inspiratory time (TI), expiratory time (TE) and total breath time (TTot) from the FLW signal. All these parameters were described in terms of their mean, median, maximum, minimum, standard deviation, kurtosis and covariance. In addition, Hjörth complexity and mobility [14] and the Higuchi fractal dimension [15] were computed.
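As a rough illustration, the statistical descriptors of one parameter series (for example, the RR intervals of one window) could be computed as follows. The helper name `describe_series`, the sample standard deviation and the Pearson definition of kurtosis are assumptions. Covariance is left out because the source does not state which pair of series it is computed between.

```python
import numpy as np
from scipy.stats import kurtosis


def describe_series(values):
    """Descriptive statistics of a 1-D parameter series."""
    v = np.asarray(values, dtype=float)
    return {
        "mean": float(np.mean(v)),
        "median": float(np.median(v)),
        "max": float(np.max(v)),
        "min": float(np.min(v)),
        # Assumption: sample standard deviation (ddof=1).
        "std": float(np.std(v, ddof=1)),
        # Assumption: Pearson kurtosis (a normal distribution gives 3).
        "kurtosis": float(kurtosis(v, fisher=False)),
    }


# Hypothetical RR intervals in seconds, for illustration only.
rr_example = [0.80, 0.82, 0.79, 0.85, 0.81]
rr_stats = describe_series(rr_example)
```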

ECG

R peaks and RR intervals were the main features extracted from ECG records. These were obtained by detecting the QRS complex with the Pan-Tompkins algorithm [16] (Fig. 3).

Fig. 3. ECG Lead II, with R peak detection for a subject in sitting position.
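The sketch below shows the main stages usually associated with a Pan-Tompkins-style detector (band-pass filtering, differentiation, squaring and moving-window integration). It is a simplified illustration, not the authors' implementation: a fixed threshold and a refractory distance stand in for the adaptive thresholds of the original algorithm. The function name, pass band, window length and threshold are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks


def detect_r_peaks(ecg, fs=250.0):
    """Simplified Pan-Tompkins-style R-peak detector (illustrative only)."""
    ecg = np.asarray(ecg, dtype=float)
    # 1) Band-pass to emphasize the QRS complex (assumed 5-15 Hz band).
    sos = butter(2, [5.0, 15.0], btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, ecg)
    # 2) Derivative and 3) squaring to highlight steep slopes.
    squared = np.gradient(filtered) ** 2
    # 4) Moving-window integration over about 150 ms.
    win = int(0.150 * fs)
    integrated = np.convolve(squared, np.ones(win) / win, mode="same")
    # 5) Fixed threshold and 200 ms refractory distance (assumption: the
    #    original algorithm uses adaptive thresholds instead).
    candidates, _ = find_peaks(
        integrated, height=0.3 * integrated.max(), distance=int(0.2 * fs)
    )
    # Refine each candidate to the largest ECG sample within +/- 100 ms.
    half = int(0.1 * fs)
    refined = []
    for c in candidates:
        lo = max(0, c - half)
        refined.append(lo + int(np.argmax(ecg[lo:c + half + 1])))
    return np.unique(refined)


# Synthetic ECG-like signal: narrow Gaussian "R waves" once per second.
fs = 250.0
t = np.arange(0, 10, 1 / fs)
beat_times = np.arange(0.5, 10, 1.0)
ecg = sum(np.exp(-0.5 * ((t - b) / 0.01) ** 2) for b in beat_times)
r_peaks = detect_r_peaks(ecg, fs)
rr_intervals = np.diff(r_peaks) / fs  # seconds
```

On real recordings, a fixed threshold is fragile against amplitude changes and noise, which is one reason manual correction of wrong detections, as described in Sect. 2.1, can still be needed.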

Once the R peaks were detected, the R-R intervals of each participant were computed. Each R-R interval was time-stamped with the time of its first R peak, and the resulting series was then re-sampled at 10 Hz to obtain the HRV signal.
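A minimal sketch of this step follows. Linear interpolation is an assumption, since the source does not name the re-sampling method, and `rr_to_hrv` is a hypothetical helper.

```python
import numpy as np


def rr_to_hrv(r_peak_times, fs_out=10.0):
    """Build a uniformly sampled HRV signal from R-peak times (seconds)."""
    r = np.asarray(r_peak_times, dtype=float)
    rr = np.diff(r)    # R-R interval durations in seconds
    rr_times = r[:-1]  # each interval is stamped with its first R peak
    t_uniform = np.arange(rr_times[0], rr_times[-1], 1.0 / fs_out)
    # Assumption: linear interpolation between the unevenly spaced samples.
    hrv = np.interp(t_uniform, rr_times, rr)
    return t_uniform, hrv


# Hypothetical R-peak times in seconds, for illustration only.
r_times = np.cumsum([0.0, 0.8, 0.9, 0.8, 1.0, 0.9])
t_hrv, hrv_signal = rr_to_hrv(r_times)
```

Re-sampling to 10 Hz puts the HRV series on the same time base as the FLW signal.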

FLW

FLW records were analyzed in terms of three features: inspiratory time, expiratory time and total breath time. To obtain these parameters, it was necessary to find the zero crossings of the signal, as can be seen in Fig. 4.

Fig. 4. Zero crossings of the respiratory flow signal.
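One possible way to derive TI, TE and TTot from the zero crossings of the flow signal is sketched below. It assumes that positive flow corresponds to inspiration and refines each crossing time by linear interpolation between samples. Both choices, and the helper name `breath_times`, are assumptions not stated in the source.

```python
import numpy as np


def breath_times(flow, fs=10.0):
    """Return arrays (TI, TE, TTot) in seconds, one entry per full breath."""
    flow = np.asarray(flow, dtype=float)
    sign = np.sign(flow)
    sign[sign == 0] = 1  # treat exact zeros as positive flow
    idx = np.where(np.diff(sign) != 0)[0]  # crossing lies in [idx, idx + 1]
    # Sub-sample crossing time by linear interpolation.
    frac = flow[idx] / (flow[idx] - flow[idx + 1])
    t_cross = (idx + frac) / fs
    rising = flow[idx + 1] > flow[idx]  # True marks the start of inspiration
    ti, te, ttot = [], [], []
    for k in range(len(idx) - 2):
        # A full breath: rising crossing, falling crossing, next rising one.
        if rising[k] and not rising[k + 1] and rising[k + 2]:
            ti.append(t_cross[k + 1] - t_cross[k])
            te.append(t_cross[k + 2] - t_cross[k + 1])
            ttot.append(t_cross[k + 2] - t_cross[k])
    return np.array(ti), np.array(te), np.array(ttot)


# Synthetic flow: sinusoid with a 4 s breathing period sampled at 10 Hz.
fs_flw = 10.0
t = np.arange(0, 20, 1 / fs_flw)
flow = np.sin(2 * np.pi * (t - 0.33) / 4.0)
ti, te, ttot = breath_times(flow, fs_flw)
```

For this symmetric test signal, TI and TE should both be close to 2 s and TTot close to 4 s.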

Complexity measurements

Complexity features provide quantitative values related to the complex behavior of the cardiorespiratory system. These non-linear features allow an assessment of the signals that has been related to physiological and pathological states, e.g. epileptic seizures, migraine and sustained attention, among others [17]. In particular, three features were computed: Hjörth complexity and mobility [14] and the Higuchi fractal dimension [15].
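The three complexity features can be sketched with their usual textbook definitions. The function names and the choice `k_max=8` for the Higuchi method are assumptions, since the source does not report the parameters used.

```python
import numpy as np


def hjorth_parameters(x):
    """Return (mobility, complexity) of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    ddx = np.diff(dx)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    # Complexity is the mobility of the derivative over that of the signal.
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return mobility, complexity


def higuchi_fd(x, k_max=8):
    """Higuchi fractal dimension; k_max is an assumed parameter."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ks = np.arange(1, k_max + 1)
    lengths = []
    for k in ks:
        lk = []
        for m in range(k):
            n_steps = (n - 1 - m) // k
            if n_steps < 1:
                continue
            idx = m + np.arange(n_steps + 1) * k
            dist = np.abs(np.diff(x[idx])).sum()
            # Normalized curve length for start offset m and lag k.
            lk.append(dist * (n - 1) / (n_steps * k) / k)
        lengths.append(np.mean(lk))
    # The dimension is the slope of log L(k) against log(1/k).
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return float(slope)


# Illustrative signals: a 5 Hz sine, a straight ramp and white noise.
t = np.arange(0, 2, 1 / 250.0)
mobility, complexity = hjorth_parameters(np.sin(2 * np.pi * 5 * t))
fd_ramp = higuchi_fd(np.linspace(0.0, 1.0, 1000))
fd_noise = higuchi_fd(np.random.default_rng(0).normal(size=1000))
```

A pure sine has a Hjörth complexity close to 1. A smooth ramp has a fractal dimension of 1, whereas white noise approaches 2.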

2.4 Spectral Features

Power spectral density (PSD), estimated with the Welch method [18] using 50% overlap and a Hamming window, was computed to analyze the frequency content of the signals. The power of the QRS complex (0.5 Hz to 4 Hz band), the power of the P and T waves (4 Hz to 8 Hz band) and the half-power frequency of the ECG signals were obtained from the PSD.

In addition, very low frequency power (0 Hz to 0.04 Hz), low frequency power (0.04 Hz to 0.15 Hz) and high frequency power (0.15 Hz to 0.4 Hz) were computed from the HRV signal.
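Band powers of this kind can be sketched by integrating a Welch PSD over each band. The segment length `nperseg` is an assumption because the source does not report it, and `band_power` is a hypothetical helper. The example uses a synthetic HRV-like signal rather than real data.

```python
import numpy as np
from scipy.signal import welch


def band_power(x, fs, f_lo, f_hi, nperseg=1024):
    """Power of x in the band [f_lo, f_hi), from a Welch PSD estimate."""
    x = np.asarray(x, dtype=float)
    nperseg = min(nperseg, len(x))  # assumed segment length
    freqs, psd = welch(
        x, fs=fs, window="hamming", nperseg=nperseg, noverlap=nperseg // 2
    )
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    # Rectangle-rule integration of the PSD over the band.
    return float(np.sum(psd[in_band]) * (freqs[1] - freqs[0]))


# Synthetic HRV-like series at 10 Hz with a 0.1 Hz (LF) component and a
# weaker 0.3 Hz (HF) component; all amplitudes are illustrative.
fs_hrv = 10.0
t = np.arange(0, 300, 1 / fs_hrv)
hrv = 0.8 + 0.05 * np.sin(2 * np.pi * 0.1 * t) + 0.025 * np.sin(2 * np.pi * 0.3 * t)
lf_power = band_power(hrv, fs_hrv, 0.04, 0.15)
hf_power = band_power(hrv, fs_hrv, 0.15, 0.40)
```

The frequency resolution is roughly `fs / nperseg`, so short segments cannot resolve the lowest HRV bands. This is worth keeping in mind when choosing the segment length.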

2.5 Statistical Analysis

Each signal was analyzed using sliding windows of 30 s with an overlap of 50%. A total of 187 features were extracted for each window. Table 2 presents the description of the temporal and spectral features extracted for each window and each signal.
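The windowing scheme can be sketched as follows. The helper name `sliding_windows` is hypothetical, and the example applies it to a placeholder 15 min signal sampled at 10 Hz.

```python
import numpy as np


def sliding_windows(x, fs, win_s=30.0, overlap=0.5):
    """Split x into windows of win_s seconds with the given fractional overlap."""
    x = np.asarray(x)
    win = int(round(win_s * fs))
    step = int(round(win * (1.0 - overlap)))
    starts = range(0, len(x) - win + 1, step)
    # Assumes len(x) >= win, so that at least one window exists.
    return np.stack([x[s:s + win] for s in starts])


# A 15 min record at 10 Hz (placeholder values) yields 59 windows of 300 samples.
fs_flw = 10.0
record = np.arange(int(15 * 60 * fs_flw), dtype=float)
windows = sliding_windows(record, fs_flw)
```

With 50% overlap, each new window starts 15 s after the previous one, so consecutive windows share half of their samples.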

Table 2. Features extracted

To identify the features with statistically significant differences between postures, a parametric Student's t-test was applied with a 5% significance level.
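A feature-wise t-test of this kind might be sketched as below. The source does not say whether a paired or an independent-samples test was used. The sketch assumes independent samples (`scipy.stats.ttest_ind`) and uses synthetic feature matrices, so it illustrates the selection step only.

```python
import numpy as np
from scipy.stats import ttest_ind


def select_significant(features_a, features_b, alpha=0.05):
    """Return indices of columns whose means differ at level alpha, and p-values."""
    # Assumption: independent two-sample t-test, one test per feature column.
    _, p_values = ttest_ind(features_a, features_b, axis=0)
    return np.where(p_values < alpha)[0], p_values


# Synthetic feature matrices (windows x features); only feature 0 differs.
rng = np.random.default_rng(0)
supine = rng.normal(0.0, 1.0, size=(200, 3))
sitting = rng.normal(0.0, 1.0, size=(200, 3))
sitting[:, 0] += 1.0
selected, p_values = select_significant(supine, sitting)
```

When many features are tested at the same level, some can appear significant by chance, so the selected set is best read as a screening result.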

2.6 Classification Techniques

Models were trained on spectral and temporal features extracted from the signals, as described in Table 2. Only the features with significant differences were fed to the models. The models were evaluated with a holdout validation scheme, with 80% of the samples (windows) used for training and 20% for testing. Three main machine learning techniques were used: decision trees, k-nearest neighbors and support vector machines:

  • Decision trees are flowchart-like structures in which each internal node represents a “test” on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label, that is, the decision taken after computing all attributes. The paths from root to leaf represent classification rules [19].

    In this study, three types of decision trees were implemented: fine, with a maximum of 100 splits; medium, with a maximum of 20 splits; and coarse, with a maximum of 4 splits.

  • K-Nearest Neighbor (KNN) is an instance-based predictive model. The input consists of the k closest training samples in the feature space, and the output is a class membership. An object is classified by a vote of its neighbors, being assigned to the class most common among its k nearest neighbors [20].

    Five different KNN models were trained: fine, medium and coarse KNN, with k equal to 1, 10 and 100, respectively. In addition, a cosine KNN (cosine distance metric) with k = 10 and a weighted KNN (neighbors weighted by distance) with k = 10 were trained.

  • Support Vector Machines (SVM) are based on transforming the data into a higher dimensional space to convert a complex classification problem into a simpler one that can be solved by a linear discriminant function, known as a hyperplane, and defined by [21, 22]

    $$ f\left( x \right) = w \cdot z + b = \sum\limits_{i = 1}^{L} \alpha_{i} y_{i} K\left( x_{i} , x \right) + b $$

    where w is the normal vector to the hyperplane, z is the image of x in the higher dimensional space, xi and yi are the L training samples and their class labels, and b is the bias. The kernel function K(xi, x) determines the shape of the decision boundary, and the coefficients αi and the bias b take the values found when the classifier is optimized. In this study, linear, quadratic and cubic kernels were evaluated.
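The eleven model variants listed above can be approximated with scikit-learn, as in the sketch below. This is not the authors' setup. The mapping of "maximum splits" to `max_leaf_nodes` (splits + 1), the feature standardization, `coef0=1.0` for the polynomial kernels and the synthetic two-class data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the feature windows: two well-separated classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (300, 5)), rng.normal(4.0, 1.0, (300, 5))])
y = np.repeat([0, 1], 300)  # 0 = supine, 1 = sitting (arbitrary coding)

# Holdout validation: 80% of the windows for training, 20% for testing.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    # A binary tree with n splits has n + 1 leaves (assumed mapping).
    "fine tree": DecisionTreeClassifier(max_leaf_nodes=101, random_state=0),
    "medium tree": DecisionTreeClassifier(max_leaf_nodes=21, random_state=0),
    "coarse tree": DecisionTreeClassifier(max_leaf_nodes=5, random_state=0),
    "fine KNN": KNeighborsClassifier(n_neighbors=1),
    "medium KNN": KNeighborsClassifier(n_neighbors=10),
    "coarse KNN": KNeighborsClassifier(n_neighbors=100),
    "cosine KNN": KNeighborsClassifier(n_neighbors=10, metric="cosine"),
    "weighted KNN": KNeighborsClassifier(n_neighbors=10, weights="distance"),
    "linear SVM": SVC(kernel="linear"),
    # coef0=1.0 gives an inhomogeneous polynomial kernel (assumption).
    "quadratic SVM": SVC(kernel="poly", degree=2, coef0=1.0),
    "cubic SVM": SVC(kernel="poly", degree=3, coef0=1.0),
}

accuracy = {}
for name, clf in models.items():
    pipe = make_pipeline(StandardScaler(), clf)  # scaling is an assumption
    pipe.fit(X_tr, y_tr)
    accuracy[name] = pipe.score(X_te, y_te)
```

Because overlapping windows from the same subject are correlated, a split made at the window level can place near-duplicate samples in both sets. Splitting by subject is a stricter alternative when comparing models.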

3 Results

3.1 Statistical Analysis

The Student's t-test showed that 148 of the 187 features present significant differences between postures.

The bar diagram in Fig. 5 presents the average value of some of the features of interest. For the implementation of the machine learning models, only the features with significant differences were taken into account.

Fig. 5. Average values of the features.

As can be seen in Fig. 5, the average values of the features do not show large differences between the sitting and supine postures, and their standard deviations are very large. However, the parametric test determined that these features describe the physiological behavior depending on the position, and the machine learning models corroborated that it is possible to determine whether a subject is in a sitting or supine posture from their cardiac and respiratory signals.

3.2 Machine Learning Models

Table 3 reports the accuracy of each trained model; the linear Support Vector Machine (SVM) shows the highest accuracy. Overall, accuracy is over 93%.

Table 3. Accuracy scores and training time according to training models.

As shown in Table 3, the best performance was obtained for the linear SVM model. Its sensitivity was 99.2% and its specificity was 99.6%.
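For reference, sensitivity and specificity follow from the confusion-matrix counts as TP / (TP + FN) and TN / (TN + FP). The sketch below uses hypothetical labels. Which posture is treated as the positive class is an assumption, since the source does not state it.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels (1 = positive class), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```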

A great variety of studies have focused on the analysis of posture through the use of sensors, image capture or physiological records, combined with machine learning models. Some of the applications of these models are facial recognition, classification of gestures, posture correction and monitoring while driving, among others [23,24,25]. In contrast, this study seeks to determine how physiological signals are affected by posture, and may thereby provide evidence to investigate whether there is a position in which clinical studies can be performed with greater clarity.

It was also possible to train a machine learning model that classifies the supine and sitting positions from the cardiac and respiratory signals, in order to provide monitoring tools for the medical field. In other studies, posture is determined using sensors incorporated in everyday objects, in combination with machine learning models [25]; however, continuous monitoring is not always possible.

4 Conclusions

It is possible to use statistical and computational tools to design machine learning models that identify the subject's posture with an accuracy of 99.5%. Future work may apply spectral and temporal analysis to other physiological signals such as electromyography (EMG), electrogastrography (EGG) and electroretinography, among others, in order to validate how the signal trace is affected by posture.

A machine learning model capable of identifying the posture of a subject from their cardiac and respiratory signals, with an accuracy of 99.5%, provides a tool for clinical applications. For instance, in the case of a patient with restricted mobility, the proposed model may warn clinical staff when the subject is in a harmful posture. The model could also be used to determine the posture of subjects wearing intelligent garments at home.