A comparative study of PCA, SIMCA and Cole model for classification of bioimpedance spectroscopy measurements
Introduction
Bioimpedance spectroscopy (BIS) is a safe, non-invasive and low-cost method for measuring the electrical response of living tissue to a low-level alternating current over a range of frequencies. Single-frequency and multi-frequency bioimpedance have been measured and modeled in many research applications to explain how different sources of variation affect the response of living tissue [1], [2]. These applications are diverse and include impedance plethysmography and pulsatile blood flow [3], [4], assessment of human body composition [5], hydration detection, characterization of fluid accumulation [6], [7] and detection of electrical anomalies in neuromuscular diseases [8].
However, classifying measured bioimpedance spectra in order to make a diagnostic decision is a particular challenge [9], [10]. The signal is recorded on the surface of a part of the body and reflects internal phenomena in the body. Several decisions must be made to classify these signals properly, including: (1) determining the essential features or codes that capture the information relevant to the diagnostic decision, (2) given a set of features, determining the kind of classifier to train so that it forms the best decision boundaries in the feature space and discriminates between the classes of the recorded signal, and (3) setting the required accuracy for a set of newly recorded signals.
These are the typical steps of the so-called classification task in the field of signal processing. In contrast to the widely used classification methods applied to ECG and EEG signals, standard classification methods for BIS data have not yet been established [11], [12].
Cole parameters are the most common descriptive parameters extracted from bioimpedance measurements. In this method the measured spectrum is fitted to an equivalent electrical circuit, and the information in the spectrum is summarized in four descriptive features based on similarities between electrical circuits and biological tissues. Cole parameters were originally introduced and applied as explanatory features of bioimpedance measurements and are known to have limitations [13]. The simplicity, popularity and explanatory power of these features also make them the first choice and the standard way of extracting descriptive features. However, researchers have found that Cole parameters extracted from repeated measurements on the same subject are too variable and thus not effective for classification [14], [15].
Principal Component Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA) are two techniques that have previously been reported as successful feature extraction methods in BIS classification tasks [16], [17], [18], [19], [20], [21]. PCA projects the whole dataset onto its orthogonal, uncorrelated eigenvectors and takes the projections onto the first eigenvectors, which carry the most salient part of the information, as features. SIMCA performs the same decorrelation, but within each class separately, and uses the distance between each data point and the eigenvectors of each individual class as features. For example, in [16], PCA and linear classifiers were applied to skin type classification. First the measurements were classified into men and women based on PCA scores of capacitance. Then skin type classification was carried out for men and women separately, using PCA scores of impedance. In [17], SIMCA was applied to the magnitude of impedance to classify different skin types. In [18], PCA was applied to complex data by treating each complex measurement as a pair of real values (real and imaginary parts), and the components were fed to a Linear Discriminant Analysis (LDA) classifier to classify different types of fish. In [19], SIMCA was applied to complex values consisting of magnitude and phase in order to classify contracted and relaxed muscles. In that work, the authors found SIMCA to be more effective than PCA for the classification task. In [14], the authors used electrical features and decision trees to achieve a successful classification for diagnosing stroke in the brain.
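The difference between the two feature types can be illustrated with a minimal Python/NumPy sketch. It is a simplification under stated assumptions, not the implementation used in the cited works: the spectra are synthetic stand-ins, the helper names (`fit_pca`, `pca_scores`, `simca_distances`) are hypothetical, and the SIMCA feature is reduced to the plain residual distance from each sample to each class's principal-component subspace (full SIMCA also scales these residuals and applies a statistical test).

```python
import numpy as np

def fit_pca(X, n_components):
    """Return (mean, components); components are rows, obtained by SVD of centered data."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_scores(X, mean, components):
    """Project samples onto the retained principal components."""
    return (X - mean) @ components.T

def simca_distances(X, class_models):
    """Residual distance from every sample to each class's PC subspace."""
    cols = []
    for mean, comps in class_models:
        centered = X - mean
        residual = centered - (centered @ comps.T) @ comps
        cols.append(np.linalg.norm(residual, axis=1))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
# Synthetic stand-in for spectra: 2 classes x 20 samples x 16 "frequency" points.
base = np.linspace(1.0, 0.2, 16)
X1 = base + 0.02 * rng.standard_normal((20, 16))
X2 = 1.3 * base + 0.02 * rng.standard_normal((20, 16))
X = np.vstack([X1, X2])

# PCA features: one model fitted on all data, scores used as features.
mean, comps = fit_pca(X, 2)
scores = pca_scores(X, mean, comps)      # shape (40, 2)

# SIMCA-style features: one PCA model per class, distances used as features.
models = [fit_pca(X1, 2), fit_pca(X2, 2)]
dists = simca_distances(X, models)       # shape (40, 2)
print(scores.shape, dists.shape)
```

In this sketch PCA is fitted without class labels, whereas the SIMCA-style features need labeled training data to build one model per class.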
Most researchers prefer Cole parameters and linear discriminant classifiers because of the high explanatory power of these methods, which makes it possible to interpret observations in terms of the related biophysical phenomena. However, the increasing number of applications of PCA-based feature extraction in BIS classification suggests an emerging trend toward using signal processing methods to classify measured bioimpedance spectra. As mentioned in [22], in some applications of bioimpedance, high classification performance compensates for the lack of an explanatory model. In previous works PCA has only been used in combination with linear classifiers. However, if the objective is to achieve high classification accuracy without the need to provide physical explanations, then using more powerful classifiers is a promising option. In addition, PCA and SIMCA have not been compared quantitatively on specific classification tasks, which is essential in order to establish new methods as standard techniques.
In this work, we first compare the above-mentioned methods on three synthetic benchmark classification tasks. In each task we simulate two classes of data, corresponding to the tissue before and after a specific change that is known to affect the frequency response of the tissue and is therefore expected to be detectable by bioimpedance spectroscopy. The three changes are a change in the relative composition of muscle and fat in the tissue, blood perfusion in a tissue consisting of muscle and fat, and a change in the tissue geometry. For each classification task, an equivalent electrical circuit is used to synthesize a number of sample spectra before and after the change. The PCA and Cole methods are then applied and compared in classifying these simulated datasets. Secondly, these methods are combined with four different types of classifiers (linear, quadratic, decision tree and K-nearest neighbors) and compared in the classification of a set of experimentally measured bioimpedance data which in theory involves all three of the above-mentioned changes. This dataset includes longitudinal bioimpedance measurements of the arm for eight subjects in three different arm positions. The classification task is to detect the arm position from the measured bioimpedance spectra.
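As an illustration of how extracted features can be paired with one of the classifier types named above, the following Python/NumPy sketch runs a K-nearest-neighbors classifier with leave-one-out evaluation. The feature vectors are randomly generated placeholders, the function names are hypothetical, and leave-one-out is only an assumed evaluation scheme; none of this is taken from the study's data or code.

```python
import numpy as np

def knn_predict(train_X, train_y, test_X, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])

def leave_one_out_accuracy(X, y, k=3):
    """Hold out each sample in turn, train on the rest, and count correct predictions."""
    hits = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        hits += int(knn_predict(X[mask], y[mask], X[i:i + 1], k)[0] == y[i])
    return hits / len(X)

rng = np.random.default_rng(1)
# Hypothetical 2-D feature vectors for two classes (placeholders, not measured data).
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(3.0, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

acc = leave_one_out_accuracy(X, y)
print(f"leave-one-out accuracy: {acc:.2f}")
```

The same evaluation loop could wrap any feature set (Cole parameters, PCA scores or SIMCA-style distances), which is what makes a like-for-like comparison of feature extraction methods possible.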
The rest of this paper is organized as follows. In Section 2, we explain the feature extraction methods used in this paper, namely the Cole method, PCA and SIMCA. We also clarify what we mean by the term classification. In Section 3, we explain the methods used to simulate and measure BIS data, as well as the classification tasks performed to compare the classification methods. Section 4 presents the results and compares the proposed methods quantitatively on the classification tasks given in Section 3. Section 5 contains the conclusion, discussion and suggestions for future work.
Cole feature extraction
Based on the Cole model introduced in [23], the impedance of a tissue can be written as

$$Z(f) = R_\infty + \frac{R_0 - R_\infty}{1 + \left(j f / f_c\right)^{\alpha}}$$

where Z(f) is the complex impedance at frequency f, R0 and R∞ represent the resistivity of the tissue at zero and infinite frequency, respectively, fc is the characteristic frequency related to the relaxation time of the electrical dispersion, and α is a constant added to this mathematical model to explain molecular interactions. α = 1 represents the absence of molecular interaction and corresponds to an ideal
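A minimal Python sketch of evaluating a Cole-type impedance is given below. It assumes the common form Z(f) = R∞ + (R0 − R∞) / (1 + (j f / fc)^α); the exponent convention and the values chosen for R0 and R∞ are illustrative assumptions, and `cole_impedance` is a hypothetical helper name.

```python
import cmath

def cole_impedance(f_hz, r0, r_inf, alpha, fc_hz):
    """Cole impedance Z(f) = R_inf + (R0 - R_inf) / (1 + (j f / fc)^alpha)."""
    return r_inf + (r0 - r_inf) / (1 + (1j * f_hz / fc_hz) ** alpha)

# Hypothetical resistances (ohm); alpha and fc are example values only.
R0, R_INF, ALPHA, FC = 60.0, 30.0, 0.72, 353e3

for f in (5e3, 353e3, 1e6):
    z = cole_impedance(f, R0, R_INF, ALPHA, FC)
    print(f"{f:>9.0f} Hz  |Z| = {abs(z):6.2f} ohm  phase = {cmath.phase(z):+.3f} rad")
```

Under this form the impedance equals R0 at zero frequency and tends toward R∞ as the frequency grows, so the four parameters summarize the whole spectrum.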
Data simulation
In this work, various sets of bioimpedance spectra of a cylindrical body organ are simulated using an electrical circuit, in the same way as introduced in [28]. In this model each type of tissue is represented by an RC circuit composed of the equivalent conductance and equivalent capacitance of the tissue. These two parameters depend on the frequency, f, and are calculated from the relative permittivity and relative conductivity of that type of tissue. For the frequency range of 5 kHz
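The idea of representing each tissue by an equivalent conductance and capacitance can be sketched in Python as follows. This is a hedged illustration, not the circuit of [28]: it assumes a parallel RC element per tissue, tissue columns that share the same length and are connected in parallel, a conductivity given in S/m, and placeholder dielectric values supplied for a single frequency (the frequency dependence of the tissue properties is not modeled here).

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def tissue_admittance(f_hz, eps_r, sigma, area_m2, length_m):
    """Admittance of one tissue column modeled as a parallel RC element (assumption)."""
    g = sigma * area_m2 / length_m            # equivalent conductance, S
    c = EPS0 * eps_r * area_m2 / length_m     # equivalent capacitance, F
    return complex(g, 2 * math.pi * f_hz * c)

def segment_impedance(f_hz, tissues, length_m):
    """Impedance of parallel tissue columns, each given as (eps_r, sigma, area_m2)."""
    y_total = sum(tissue_admittance(f_hz, eps_r, sigma, area, length_m)
                  for eps_r, sigma, area in tissues)
    return 1 / y_total

# Placeholder values at one frequency: (relative permittivity, conductivity S/m, area m^2).
muscle = (1.0e4, 0.35, 3.0e-3)
fat = (2.0e2, 0.02, 2.0e-3)

z = segment_impedance(50e3, [muscle, fat], length_m=0.25)
print(f"|Z| = {abs(z):.1f} ohm, phase = {math.degrees(math.atan2(z.imag, z.real)):.2f} deg")
```

Changing the cross-sectional areas in this sketch mimics a change in tissue composition, while changing `length_m` mimics a change in geometry.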
Geometry classification
All the simulations and analyses in this work are implemented in MATLAB 7.10 (R2010a). We fitted the simulated prototype spectrum for each class described in Section 3.1.2 to the Cole model and extracted the following parameters: for Class 1 the prototype spectrum is parametrized to […, 0.72, 353 kHz] and Class 2 is parametrized to […, 0.72, 353 kHz]. Considering these parameters we conclude that geometry change has altered the parameters R0 and R∞, but it did
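One way to see why a geometry change could move only the resistance parameters is sketched below in Python. It assumes that the geometry change multiplies the impedance at every frequency by a constant factor K, and it assumes the Cole form Z(f) = R∞ + (R0 − R∞) / (1 + (j f / fc)^α); both are simplifying assumptions, and the values of R0, R∞ and K are hypothetical (only α = 0.72 and fc = 353 kHz come from the text).

```python
def cole_impedance(f_hz, r0, r_inf, alpha, fc_hz):
    """Cole impedance Z(f) = R_inf + (R0 - R_inf) / (1 + (j f / fc)^alpha)."""
    return r_inf + (r0 - r_inf) / (1 + (1j * f_hz / fc_hz) ** alpha)

ALPHA, FC = 0.72, 353e3     # values quoted in the text
R0, R_INF = 60.0, 30.0      # hypothetical resistances, ohm
K = 1.25                    # hypothetical geometric scale factor

# Log-spaced frequencies from 5 kHz to roughly 1 MHz.
freqs = [5e3 * 10 ** (i / 10) for i in range(24)]

# Scaling the whole spectrum by K is the same as scaling R0 and R_inf by K
# while leaving alpha and fc untouched.
max_diff = max(
    abs(K * cole_impedance(f, R0, R_INF, ALPHA, FC)
        - cole_impedance(f, K * R0, K * R_INF, ALPHA, FC))
    for f in freqs
)
print(f"largest difference over the band: {max_diff:.2e} ohm")
```

Because the two spectra coincide up to rounding error, a fit to the scaled spectrum would report scaled resistances with the same α and fc.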
Discussion and conclusions
In this work, we investigated whether PCA and SIMCA could be considered as standard methods of feature extraction instead of Cole modeling, in classification of BIS measurements. Some works had reported SIMCA or PCA features in combination with linear classifiers as promising techniques for classification of BIS. However, these methods have not been studied in combination with other classifiers and have not been compared with Cole features for a specific task. In most research applications of
Conflict of interest statement
We, Isar Nejadgholi and Miodrag Bolic, authors of the manuscript “A Comparative Study of PCA, SIMCA and Cole model for Classification of Bioimpedance Spectroscopy Measurements”, certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript. In addition, the protocol for measuring the bioimpedance of human subjects' arms used in this study was approved by the Research Ethics Board (REB) of the University of Ottawa.
Acknowledgments
The authors would like to acknowledge Nuraleve Inc., Mitacs and NSERC for supporting this work. They are also grateful to Dr. Andy Adler for his generous donation of Lab space and equipment for performing the experiments.
References (31)
- et al. Using phase space reconstruction for patient independent heartbeat classification in comparison with some benchmark methods. Comput. Biol. Med. (2011)
- et al. Multivariate analysis of skin impedance data in long-term type 1 diabetic patients. Chemom. Intell. Lab. Syst. (1998)
- et al. Bioelectrical impedance analysis of frozen sea bass. J. Food Eng. (2008)
- Cross-validation methods. J. Math. Psychol. (2000)
- et al. Accuracy of direct segmental multi-frequency bioimpedance analysis in the assessment of total body and segmental body composition in middle-aged adult population. Clin. Nutr. (2011)
- et al. Clinical characteristics influencing analysis measurements. Am. J. Clin. Nutr. (1996)
- et al. The theory and fundamentals of bioimpedance analysis in clinical status monitoring and diagnosis of diseases. Sensors (2014)
- B.M. Eyüboğlu, Electrical impedance plethysmography, in: Wiley Encyclopedia of Biomedical Engineering,...
- T. Dai, A. Adler, Blood impedance characterization from pulsatile measurements, in: Canadian Conference on Electrical...
- et al. Body composition in 18- to 88-year-old adults: comparison of multifrequency bioimpedance and dual-energy X-ray absorptiometry. Obesity (2014)
- Evaluation of clinical dry weight assessment in haemodialysis patients using bioimpedance spectroscopy: a cross-sectional study. Nephrol. Dial. Transplant.
- A method for the estimation of hydration state during hemodialysis using a calf bioimpedance technique. Physiol. Meas.
- Assessing neuromuscular disease with multifrequency electrical impedance myography. Muscle Nerve
- Classifying biosignals with wavelet networks. IEEE Eng. Med. Biol.
- On the classification of emotional biosignals evoked while viewing affective pictures: an integrated data-mining-based approach for healthcare applications. IEEE Trans. Inf. Technol. Biomed.