Elsevier

Computers in Biology and Medicine

Volume 63, 1 August 2015, Pages 42-51

A comparative study of PCA, SIMCA and Cole model for classification of bioimpedance spectroscopy measurements

https://doi.org/10.1016/j.compbiomed.2015.05.004

Abstract

Because bioimpedance spectroscopy (BIS) is safe and low-cost, classifying BIS measurements is potentially a preferred way of detecting changes in living tissues. However, for longitudinal datasets, linear classifiers fail to classify the conventional Cole parameters extracted from BIS measurements because of their high variability. In some applications, linear classification based on Principal Component Analysis (PCA) has given more accurate results. Yet these methods have not been established for BIS classification, since PCA features have neither been investigated in combination with other classifiers nor been compared to conventional Cole features in benchmark classification tasks.

In this work, PCA and Cole features are compared in three synthesized benchmark classification tasks, each involving a tissue change that BIS is expected to detect. The three tasks are classifying spectra before and after a geometry change, a relative composition change, and blood perfusion in a cylindrical organ. Our results show that in all tasks the features extracted by PCA are more discriminant than Cole parameters. Moreover, a pilot study was done on a longitudinal arm BIS dataset including eight subjects and three arm positions. The goal of the study was to compare different methods in arm position classification, which involves all three synthesized changes mentioned above. Our comparative study of various classification methods shows that the best classification accuracy is obtained when PCA features are classified by a K-Nearest Neighbors (KNN) classifier. The results of this work suggest that PCA+KNN is a promising method for classifying BIS datasets that involve subject and time variability.

Introduction

Bioimpedance spectroscopy (BIS) is a safe, non-invasive and low-cost method to measure the electrical response of living tissues to a low-level alternating current over a range of frequencies. Single-frequency and multi-frequency bioimpedance have been measured and modeled in many research applications to explain how different sources of variation affect the response of a living tissue [1], [2]. These applications are diverse and include impedance plethysmography and pulsatile blood flow [3], [4], assessment of human body composition [5], hydration detection, characterization of fluid accumulation [6], [7] and detection of electrical anomalies in neuromuscular diseases [8].

However, classifying measured bioimpedance spectra in order to make a diagnostic decision is a particular challenge [9], [10]. The signal is recorded on the surface of a part of the body and reflects internal phenomena in the body. Several decisions need to be made to classify these signals properly: (1) determining the essential features or codes that encompass the information relevant to the diagnostic decision, (2) given a set of features, determining the kind of classifier to train so that it forms the best decision boundaries in the feature space and discriminates between the classes of the recorded signal, and (3) setting the required accuracy for a set of newly recorded signals.

These are typical steps of the so-called classification task in the field of signal processing. In contrast to the widely used classification methods applied to ECG and EEG signals, standard classification methods for BIS data have not yet been established [11], [12].

Cole parameters are the most common descriptive parameters extracted from bioimpedance measurements. In this method the measured spectrum is fitted to an equivalent electrical circuit, and the information in the spectrum is summarized into four descriptive features based on similarities between the electrical circuit and the biological tissue. Cole parameters were primarily introduced and applied as explanatory features of bioimpedance measurements and are known to have limitations [13]. The simplicity, popularity and explanatory power of these features also make them the first choice and the standard way of extracting descriptive features. However, researchers have found that Cole parameters extracted from repeated measurements on the same subject are too variable and thus not effective in achieving a classification task [14], [15].

Principal Component Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA) are two techniques that have been reported as successful feature extraction methods in BIS classification tasks [16], [17], [18], [19], [20], [21]. PCA projects the whole dataset onto its orthogonal eigenvectors, which yields uncorrelated components, and keeps the most salient part of the information, carried by the first eigenvectors, as features. SIMCA implements the same decorrelation, but within each class separately, and considers the distance between each data point and the eigenvectors of each individual class as features. For example, in [16], PCA and linear classifiers were applied for skin type classification. First the measurements were classified into men and women based on PCA scores of capacitance. Then skin type classification was achieved for men and women separately, using PCA scores of impedance. In [17], SIMCA was applied to the magnitude of impedance to classify different skin types. In [18], PCA was applied to complex data, treating each complex measurement as a set of two real values (real and imaginary), and the components were fed to a Linear Discriminant Analysis (LDA) classifier to classify different types of fish. In [19], SIMCA was applied to complex values consisting of magnitude and phase in order to classify contracted and relaxed muscles. In that work, the authors found SIMCA to be more effective than PCA in achieving the classification task. In [14], the authors used electrical features and decision trees to achieve a successful classification for diagnosing stroke in the brain.
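The difference between the two feature extractors described above can be illustrated with a minimal NumPy sketch. It is not the code used in the cited works: the helper names (`stack_complex`, `pca_fit`, `pca_scores`, `simca_distances`) are hypothetical, and the SIMCA distance is taken here to be the residual distance of a sample from each class's principal subspace, which is one common reading of the method.

```python
import numpy as np

def stack_complex(z):
    """Treat each complex spectrum as real values: [real parts, imaginary parts]."""
    z = np.asarray(z)
    return np.hstack([z.real, z.imag])

def pca_fit(x, n_components):
    """Return the data mean and the first principal directions (as rows)."""
    mean = x.mean(axis=0)
    # Rows of vt are eigenvectors of the sample covariance matrix.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_scores(x, mean, components):
    """PCA features: coordinates of each sample on the principal directions."""
    return (x - mean) @ components.T

def simca_distances(x, class_models):
    """SIMCA-style features: distance of each sample to each class subspace.

    class_models is a list of (mean, components) pairs, one per class,
    each fitted with pca_fit on the samples of that class only.
    """
    columns = []
    for mean, components in class_models:
        centered = x - mean
        projected = centered @ components.T @ components
        columns.append(np.linalg.norm(centered - projected, axis=1))
    return np.column_stack(columns)
```

In this sketch PCA is fitted once on all samples and gives one score vector per sample, whereas the SIMCA-style features require one `pca_fit` call per class and give one distance per class. Either feature matrix can then be passed to any classifier.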

Most researchers prefer Cole parameters and linear discriminant classifiers because of the high explanatory power of these methods, which makes it possible to interpret observations in terms of the related biophysical phenomena. However, the increasing number of applications of PCA-based feature extraction in BIS classification suggests an emerging trend toward using signal processing methods to classify measured bioimpedance spectra. As mentioned in [22], in some applications of bioimpedance, high classification performance compensates for the lack of an explanatory model. In previous works PCA has only been used in combination with linear classifiers. However, if the objective is to achieve high classification accuracy without the need to provide physical explanations, then using more powerful classifiers is a promising option. Also, neither PCA nor SIMCA has been compared quantitatively in specific classification tasks, which is essential in order to establish new methods as standard techniques.

In this work, we first compare the above-mentioned methods in three synthetic benchmark classification tasks. In each classification task we simulate two classes of data related to the tissue before and after a specific change that is known to affect the frequency response of the tissue and is therefore expected to be detected by bioimpedance spectroscopy. The three changes are a change in the relative composition of muscle and fat in the tissue, blood perfusion in a tissue consisting of muscle and fat, and a change of the tissue geometry. For each classification task, an equivalent electrical circuit is used to synthesize a number of sample spectra related to before and after the change. PCA and Cole methods are then applied and compared in classifying these simulated datasets. Secondly, these methods are combined with four different types of classifiers (linear, quadratic, decision tree and K-nearest neighbors) and compared in classifying a set of experimentally measured bioimpedance data which in theory involves all three above-mentioned changes. This dataset includes longitudinal bioimpedance measurements of the arm for eight subjects, in three different arm positions. The classification task involves classifying measured bioimpedance spectra to detect arm positions.
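As a rough illustration of the workflow outlined above (synthesize two classes of spectra, extract PCA features, classify with K-nearest neighbors), the following self-contained NumPy sketch may help. It is a toy under stated assumptions, not the authors' MATLAB implementation: the spectra come from the Cole model using the Class 1 and Class 2 prototype parameters quoted in the geometry classification snippet, the 5% random scaling of the resistances, the 5 kHz to 1 MHz frequency grid, the train/test split and `k=3` are all arbitrary choices, and the helper names are hypothetical.

```python
import numpy as np

def cole_impedance(f, r0, r_inf, fc, alpha):
    """Cole model impedance at frequency f (Hz)."""
    return r_inf + (r0 - r_inf) / (1.0 + (1j * f / fc) ** alpha)

def simulate_class(rng, n, r0, r_inf, freqs, jitter=0.05):
    """Toy variability model (assumption): scale both resistances randomly."""
    spectra = []
    for _ in range(n):
        scale = 1.0 + jitter * rng.standard_normal()
        spectra.append(cole_impedance(freqs, r0 * scale, r_inf * scale, 353e3, 0.72))
    z = np.array(spectra)
    # Each complex spectrum becomes one real vector: [real parts, imaginary parts].
    return np.hstack([z.real, z.imag])

def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dist = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in train_y[nearest]])

rng = np.random.default_rng(0)
freqs = np.logspace(np.log10(5e3), 6, 40)  # upper limit of 1 MHz is an assumption

# Class 0: before the geometry change; class 1: after it.
x = np.vstack([
    simulate_class(rng, 30, 6.0, 3.0, freqs),
    simulate_class(rng, 30, 9.1, 4.5, freqs),
])
y = np.repeat([0, 1], 30)

# Hold out one third of the samples for testing.
order = rng.permutation(len(y))
train, test = order[:40], order[40:]

# PCA is fitted on the training spectra only, then applied to the test spectra.
mean = x[train].mean(axis=0)
_, _, vt = np.linalg.svd(x[train] - mean, full_matrices=False)
components = vt[:2]
train_scores = (x[train] - mean) @ components.T
test_scores = (x[test] - mean) @ components.T

predicted = knn_predict(train_scores, y[train], test_scores, k=3)
accuracy = (predicted == y[test]).mean()
print(f"held-out accuracy: {accuracy:.2f}")
```

Fitting the PCA on the training split alone keeps the held-out spectra from influencing the features. The two toy classes are well separated by construction, so the accuracy printed here says nothing about performance on measured data.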

The rest of this paper is organized as follows. In Section 2, we explain the feature extraction methods used in this paper, which are the Cole method, PCA and SIMCA. We also clarify what we mean by the term classification. In Section 3, we explain the methods that we used to simulate and measure BIS data, as well as the classification tasks that we performed to compare the classification methods. Section 4 presents the obtained results and compares the proposed methods quantitatively in the classification tasks given in Section 3. Section 5 includes the conclusion, discussion and suggestions for future work.

Section snippets

Cole feature extraction

Based on the Cole model introduced in [23], the impedance of a tissue can be written as

$$Z(f) = R_\infty + \frac{R_0 - R_\infty}{1 + \left(j\,f/f_c\right)^{\alpha}}$$

where $Z(f)$ is the complex impedance at frequency $f$, $R_0$ and $R_\infty$ represent the resistance of the tissue at zero and infinite frequency, respectively, $f_c$ is the characteristic frequency related to the relaxation time of the electrical dispersion, and $\alpha$ is a constant added to this mathematical model to explain molecular interactions. $\alpha = 1$ represents the absence of molecular interaction and corresponds to an ideal
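A short numerical sketch of the Cole model may make the role of the four parameters concrete. It assumes the standard form of the model, in which the impedance equals R0 at zero frequency and tends to the infinite-frequency resistance as the frequency grows. The function name `cole_impedance` and the 1 MHz upper frequency limit are assumptions, and the parameter values are the Class 1 prototype values quoted in the geometry classification snippet, used here only for illustration.

```python
import numpy as np

def cole_impedance(f, r0, r_inf, fc, alpha):
    """Complex impedance of the Cole model at frequency f (Hz).

    r0 and r_inf are the resistances at zero and infinite frequency,
    fc is the characteristic frequency and alpha the dispersion exponent.
    """
    f = np.asarray(f, dtype=float)
    return r_inf + (r0 - r_inf) / (1.0 + (1j * f / fc) ** alpha)

# Log-spaced grid from 5 kHz to 1 MHz (the upper limit is an assumption).
freqs = np.logspace(np.log10(5e3), 6, 50)
spectrum = cole_impedance(freqs, r0=6.0, r_inf=3.0, fc=353e3, alpha=0.72)

# The magnitude falls from near r0 toward r_inf as the frequency increases.
print(abs(spectrum[0]), abs(spectrum[-1]))
```

With `alpha=1.0` the expression reduces to a single-pole response, so at `f == fc` the impedance is `r_inf + (r0 - r_inf) / (1 + 1j)`.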

Data simulation

In this work, various sets of bioimpedance spectra of a cylindrical body organ are simulated using an electrical circuit, in the same way as introduced in [28]. In this model each type of tissue is represented by an RC circuit composed of the equivalent conductance and equivalent capacitance of the tissue. These two parameters depend on the frequency, f, and are calculated using the relative permittivity, ϵr(f), and relative conductivity, σr(f), of that type of tissue. For the frequency range of 5 kHz
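The idea of representing each tissue by an equivalent conductance and capacitance can be sketched as follows. This is an illustration under several assumptions, not the circuit of [28]: each tissue is taken to be a uniform cylinder segment whose conductance and capacitance follow the textbook expressions G = σA/L and C = ε0·εr·A/L, the conductance and capacitance of a tissue are assumed to be in parallel, the tissues are assumed to lie in parallel along the organ, and `sigma` is treated as a conductivity in S/m. The material values are placeholders evaluated at a single frequency rather than the frequency-dependent ϵr(f) and σr(f) used in the paper.

```python
import numpy as np

EPS0 = 8.854e-12  # vacuum permittivity in F/m

def tissue_admittance(f, sigma, eps_r, area, length):
    """Admittance of one tissue modeled as a parallel conductance and capacitance."""
    conductance = sigma * area / length
    capacitance = EPS0 * eps_r * area / length
    return conductance + 1j * 2.0 * np.pi * f * capacitance

def organ_impedance(f, tissues, length):
    """Impedance of several tissues lying in parallel along the organ (assumption)."""
    total = sum(
        tissue_admittance(f, t["sigma"], t["eps_r"], t["area"], length)
        for t in tissues
    )
    return 1.0 / total

# Placeholder material properties and cross sections, not values from the paper.
tissues = [
    {"name": "muscle", "sigma": 0.4, "eps_r": 1.0e4, "area": 3.0e-3},
    {"name": "fat", "sigma": 0.03, "eps_r": 1.0e2, "area": 2.0e-3},
]

z_short = organ_impedance(50e3, tissues, length=0.20)
z_long = organ_impedance(50e3, tissues, length=0.30)
print(abs(z_short), abs(z_long))
```

In this simplified model, lengthening the segment scales the whole impedance by the same factor at every frequency, and changing the cross sections of muscle and fat changes their relative contribution to the total admittance.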

Geometry classification

All the simulations and analyses in this work are implemented in MATLAB 7.10 (R2010a). We fitted the simulated prototype spectrum for each class described in Section 3.1.2 to the Cole model and extracted the following parameters: for Class 1 the prototype spectrum is parametrized to $[R_0, R_\infty, \alpha, f_c] = [6\,\Omega, 3\,\Omega, 0.72, 353\,\mathrm{kHz}]$, and for Class 2 to $[R_0, R_\infty, \alpha, f_c] = [9.1\,\Omega, 4.5\,\Omega, 0.72, 353\,\mathrm{kHz}]$. Considering these parameters, we conclude that the geometry change has altered the parameters $R_0$ and $R_\infty$, but it did

Discussion and conclusions

In this work, we investigated whether PCA and SIMCA could be considered as standard methods of feature extraction, instead of Cole modeling, in the classification of BIS measurements. Some studies have reported SIMCA or PCA features in combination with linear classifiers as promising techniques for classification of BIS. However, these methods have not been studied in combination with other classifiers and have not been compared with Cole features for a specific task. In most research applications of

Conflict of interest statement

We, Isar Nejadgholi and Miodrag Bolic, authors of the manuscript “A Comparative Study of PCA, SIMCA and Cole model for Classification of Bioimpedance Spectroscopy Measurements”, certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript. Also, the protocol for measuring the bioimpedance of human subjects’ arms used in this study was approved by the Research Ethics Board (REB) of the University of Ottawa.

Acknowledgments

The authors would like to acknowledge Nuraleve Inc., Mitacs and NSERC for supporting this work. They are also grateful to Dr. Andy Adler for his generous donation of lab space and equipment for performing the experiments.

References (31)

  • J. Passauer et al.

    Evaluation of clinical dry weight assessment in haemodialysis patients using bioimpedance spectroscopy: a cross-sectional study

    Nephrol. Dial. Transplant.

    (2010)
  • F. Zhu et al.

    A method for the estimation of hydration state during hemodialysis using a calf bioimpedance technique

    Physiol. Meas.

    (2008)
  • G.J. Esper et al.

    Assessing neuromuscular disease with multifrequency electrical impedance myography

    Muscle Nerve

    (2006)
  • H. Dickhaus et al.

    Classifying biosignals with wavelet networks

    IEEE Eng. Med. Biol.

    (1996)
  • C. Frantzidis et al.

    On the classification of emotional biosignals evoked while viewing affective pictures: an integrated data-mining-based approach for healthcare applications

    IEEE Trans. Inf. Technol. Biomed.

    (2010)