Because the phase of the UWB signal reflected from the chest wall changes with physiological motion, we can extract respiration and heartbeat from it. Given the 8.75
\(\text{GHz}\) center frequency of the UWB signal in our design, a 0.2- to 0.5-
mm displacement caused by a heartbeat [
68] translates to a
\(2.1^{\circ }\)–
\(5.3^{\circ }\) change in phase, while a 4- to 12-
mm displacement caused by respiration [
19] translates to a
\(42.0^{\circ }\)–
\(126.0^{\circ }\) change in phase [
38]. The heartbeat signal is thus roughly 8 to 60 times weaker in phase excursion than the respiration signal and is completely buried by it in the time domain. However, the difference between the typical frequency ranges of respiration (
\(\sim\)6–18
bpm) and heartbeat (
\(\sim\)50–150
bpm) allows them to be extracted separately. Figure
6 shows that, with fine-tuned bandpass filters applied to the phase signal, respiration and heartbeat can be easily recognized in the FFT spectrum. Although this example looks straightforward, robustly measuring vital signs remains an open challenge. We next introduce estimation methods for both respiration rate and heart rate.
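The phase ranges quoted above are consistent with the relation \(\Delta \phi = 360^{\circ } \cdot d / \lambda\), where \(\lambda = c / f_c \approx 34.3\) mm at 8.75 GHz. The following minimal sketch reproduces them under that assumption; the relation is inferred from the numbers rather than stated in the text, and a round-trip model (\(4\pi d/\lambda\)) would give twice these values. The function name is illustrative.

```python
C = 299_792_458.0  # speed of light in m/s


def phase_change_deg(displacement_mm, center_freq_hz=8.75e9):
    """Phase change in degrees for a chest displacement, assuming
    delta_phi = 360 * d / lambda (inferred from the quoted figures)."""
    wavelength_mm = C / center_freq_hz * 1e3
    return 360.0 * displacement_mm / wavelength_mm


# Heartbeat (0.2-0.5 mm) and respiration (4-12 mm) displacements.
for d_mm in (0.2, 0.5, 4.0, 12.0):
    print(f"{d_mm:5.1f} mm -> {phase_change_deg(d_mm):6.1f} deg")
```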
4.3.2 Heart Rate Estimation.
Extracting the heart rate is more challenging because the heartbeat has a much smaller RCS and displacement, and therefore a much weaker magnitude in both the temporal and spectral domains. As explained in Section
2.4, harmonics and intermodulation products of respiration can easily dominate the heartbeat signal, and their patterns change over time.
To robustly measure heart rate, we propose a PWF that (1) incorporates four heart rate estimators, each suited to one of four identified temporal and spectral patterns; (2) adaptively combines the heart rate candidates generated by the estimators according to the quantified cumulative evidence for each pattern; and (3) leverages limits on how fast the heart rate can change to smooth the continuous measurements.
Heartbeat Signal Extraction. In this step, we filter out noise and respiration signals and enhance the heartbeat signal for estimation. While the heartbeat signal changes periodically, the noise behaves randomly and can be modeled as Gaussian. We use auto-correlation [
69] to suppress the noise and enhance the periodic pattern of the heartbeats. We also observe that, because of its higher frequency, the heartbeat causes larger changes between adjacent sampling points than respiration does. We therefore use the second-order difference to make the heartbeat more prominent.
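A minimal sketch of these two enhancement steps follows. It is one plausible reading rather than the exact implementation: the function names, the mean removal, and the normalization by the zero-lag value are assumptions.

```python
import numpy as np


def autocorrelate(x):
    """Normalised auto-correlation for non-negative lags.

    Uncorrelated noise contributes mainly at lag 0, whereas a periodic
    component reappears at multiples of its period.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    full = np.correlate(x, x, mode="full")
    acf = full[x.size - 1:]          # keep lags 0, 1, 2, ...
    return acf / acf[0]


def second_order_difference(x):
    """x[n+2] - 2*x[n+1] + x[n]; emphasises faster-varying components."""
    return np.diff(np.asarray(x, dtype=float), n=2)
```

For a sinusoid of frequency \(f\) sampled at \(f_s\), the second-order difference scales the amplitude by \(4\sin ^2(\pi f/f_s)\), so a component near 1.25 Hz is amplified far more than one near 0.25 Hz, which matches the observation that the heartbeat changes more between adjacent samples.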
Then, we use the Discrete Wavelet Transform (DWT) as a filter bank [
79] to extract the heartbeat signal, because the DWT retains the inherently irregular shape of vital signals, whereas conventional filters (e.g., the Butterworth filter [
34]) smooth the shape and lose information needed for temporal analysis. We progressively split the signal into
approximation coefficients (from the low-pass filter) and
detail coefficients (from the high-pass filter), each iteration decomposing the approximation coefficients of the previous one, and reconstruct the signal from the coefficients in the frequency range of interest (0.625–5 Hz, which covers both the fundamental and the second-order harmonic). With
L iterations (corresponding to
L scales), an approximation coefficient
\(\gamma ^{(L)}\) and a sequence of detail coefficients
\(\upsilon ^{(1)}, \upsilon ^{(2)}, \ldots , \upsilon ^{(L)}\) are calculated in (
13).
where
\(\varphi\) denotes the scaling function and
\(\psi\) the wavelet. The heartbeat signal can be reconstructed using the inverse DWT:
In VitalHub, we select the Daubechies (db4) wavelet as the mother wavelet [
22] and decompose the signal into 4 levels. The detail coefficients
\(\upsilon ^{(3)} + \upsilon ^{(4)}\) (ranging from 0.625 Hz to 2.5 Hz) are used to reconstruct the heartbeat signal. The coefficients
\(\upsilon ^{(2)}+\upsilon ^{(3)}\) (ranging from 1.25 Hz to 5 Hz) are used to reconstruct the second-order harmonic component of the heartbeat signal.
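To make the mapping between decomposition levels and frequency bands concrete, the sketch below lists the nominal band of each output of a dyadic 4-level decomposition. The 20 Hz sampling rate is an assumption inferred from the quoted band edges, not a value stated in the text, and the helper names are illustrative. The bands are nominal: real wavelet filters such as db4 overlap at the edges.

```python
def dwt_bands(fs_hz, levels):
    """Nominal (low, high) band in Hz of each output of a dyadic DWT."""
    bands = {}
    for level in range(1, levels + 1):
        bands[f"detail_{level}"] = (fs_hz / 2 ** (level + 1), fs_hz / 2 ** level)
    bands[f"approx_{levels}"] = (0.0, fs_hz / 2 ** (levels + 1))
    return bands


def detail_levels_covering(fs_hz, levels, lo_hz, hi_hz):
    """Detail levels whose nominal band lies entirely inside [lo_hz, hi_hz]."""
    return [
        level
        for level in range(1, levels + 1)
        if fs_hz / 2 ** (level + 1) >= lo_hz and fs_hz / 2 ** level <= hi_hz
    ]


FS_HZ = 20.0  # assumed sampling rate (inferred, see lead-in)
for name, (lo, hi) in dwt_bands(FS_HZ, 4).items():
    print(f"{name}: {lo:.3f}-{hi:.3f} Hz")
print(detail_levels_covering(FS_HZ, 4, 0.625, 2.5))  # fundamental band
print(detail_levels_covering(FS_HZ, 4, 1.25, 5.0))   # second-harmonic band
```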
Ensemble of Heart Rate Estimators. Based on manual examination of over 6,000 data samples, we identify four typical temporal/spectral patterns (present in \(98.55\%\) of the data) and assign to each pattern a suitable estimator of the fundamental heart rate: (1) zero-crossing (ZC), (2) peak interval (PK), (3) local maximum detection in the spectrum of the heart rate range (LMD), and (4) spectral peak detection in the range of the heartbeat signal’s second-order harmonics (SOH).
The first two handle two temporal patterns. ZC estimates the heart rate by counting the number of zero-crossings in a time window, and it suits signals that alternate periodically between negative and positive values. Higher-order harmonics of respiration may cause additional negative-to-positive transitions and thus a falsely high heart rate. PK measures the average interval between adjacent local maxima in a time window and derives the heart rate from it. It is relatively immune to interfering signals of larger energy but sensitive to high-frequency jitter.
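The two temporal estimators can be sketched as follows. This is a simplified illustration, not the exact implementation: the function names are hypothetical, ZC is assumed to count only negative-to-positive transitions (one per cycle), and PK uses plain local maxima without any jitter suppression.

```python
import numpy as np


def zc_rate_bpm(x, fs_hz):
    """Heart rate from the number of negative-to-positive zero-crossings."""
    x = np.asarray(x, dtype=float)
    negative = x < 0
    rising = np.count_nonzero(negative[:-1] & ~negative[1:])
    duration_s = (x.size - 1) / fs_hz
    return 60.0 * rising / duration_s


def pk_rate_bpm(x, fs_hz):
    """Heart rate from the mean interval between adjacent local maxima."""
    x = np.asarray(x, dtype=float)
    peaks = np.flatnonzero((x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:])) + 1
    if peaks.size < 2:
        return float("nan")  # not enough peaks to form an interval
    mean_interval_s = np.mean(np.diff(peaks)) / fs_hz
    return 60.0 / mean_interval_s
```

Because ZC counts whole cycles inside a finite window, its estimate is quantized by the window length, whereas PK averages intervals and is therefore more exposed to spurious maxima caused by jitter.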
The latter two handle two spectral patterns. When the fundamental spectral peak of heartbeat has significant energy [
5], LMD detects such high peaks in the heart rate range (50–150 bpm). When higher-order harmonics or intermodulation products of respiration have strong energy, they may overwhelm the heartbeat peak in this range. SOH therefore selects spectral peaks in the range of the second-order harmonic of the heartbeat (100–300 bpm) and halves them to obtain estimates. We observe that respiration harmonics and intermodulation have much weaker energy in this range [
64]. Due to partial overlap with the heartbeat fundamental frequency range, respiration may still produce significant peaks on occasion, resulting in erroneous heart rate estimation.
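A minimal sketch of the two spectral estimators is given below. The Hann window, the zero-padded FFT length, and all function names are assumptions made for illustration; only the band limits (50–150 bpm and 100–300 bpm), the halving step, and the choice of the three largest peaks come from the text.

```python
import numpy as np


def spectral_peaks_bpm(x, fs_hz, lo_bpm, hi_bpm, k=3, nfft=4096):
    """Frequencies (bpm) of the k largest spectral local maxima in a band."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = max(nfft, x.size)  # zero-pad, never crop the window
    mag = np.abs(np.fft.rfft(x * np.hanning(x.size), n=n))
    freqs_bpm = np.fft.rfftfreq(n, d=1.0 / fs_hz) * 60.0
    is_peak = np.zeros(mag.size, dtype=bool)
    is_peak[1:-1] = (mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])
    in_band = (freqs_bpm >= lo_bpm) & (freqs_bpm <= hi_bpm)
    idx = np.flatnonzero(is_peak & in_band)
    idx = idx[np.argsort(mag[idx])[::-1][:k]]  # strongest peaks first
    return freqs_bpm[idx]


def lmd_candidates(x, fs_hz, k=3):
    """Peaks in the fundamental heart rate range (50-150 bpm)."""
    return spectral_peaks_bpm(x, fs_hz, 50.0, 150.0, k)


def soh_candidates(x, fs_hz, k=3):
    """Peaks in the second-harmonic range (100-300 bpm), halved."""
    return spectral_peaks_bpm(x, fs_hz, 100.0, 300.0, k) / 2.0
```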
Using a sliding window, we produce a heart rate candidate set \(C_{t}\) at time t. It consists of \(C_{t}^F\), which holds the two estimates from ZC and PK together with the three largest peaks from LMD, and \(C_{t}^S\), which holds the three largest peaks from SOH. Unless explicitly stated otherwise, a candidate \(c_{t}^m\) is chosen from the combined set \(C_{t}=C_{t}^F \cup C_{t}^S\).
Probabilistic Heart Rate Tracking. We formulate the continuous heart rate estimation as tracking the “trend” of changes, with the state update equation as follows:
where
\(x_{t-1}\) is the state (i.e., heart rate) we have estimated at time
\(t-1\),
\(\hat{x}_{t}\) is the heart rate predicted at time
t,
\(\Delta t\) is the estimation interval (set to 1 second in our configuration), and
\(\varepsilon _p \sim \mathcal {N}(0, \sigma _p^2)\) is the process noise. Because errors accumulate over time, the predictions must be calibrated using evidence from observations.
The four temporal/spectral patterns are present most of the time (\(\gt\)98%); thus, the heart rate candidate set \(C_{t}\) very likely includes the correct one. The key is to determine which one. We quantify the evidence of each candidate \(c_t^m\) to determine its weight and calibrate predictions.
•
Respiration Harmonics. Assume that the fundamental respiration frequency is
\(f^r_t\). Then, its harmonics are represented as
\(H^r_t=\lbrace f^r_t, 2f^r_t, \ldots ,Nf^r_t\rbrace\), where
N is empirically limited to 5 because harmonics beyond the fifth are negligible [
64]. The closer a candidate is to any respiration harmonic, the less likely it is to be the true heart rate, which is captured by the following weight:
where
\(n \in \lbrace 1,2, \ldots ,N\rbrace\),
\(g_{r}(\cdot) \sim \mathcal {N}(0, \sigma _r^2)\) is a Gaussian distribution and
\(\sigma _r\) is empirically set to 2.
•
Heartbeat Harmonics. The heartbeat signal also has harmonics, whereas random noise may not. Thus, the existence of higher-order harmonics can be used as evidence of the heartbeat fundamental frequency
\(f_h\). As the heartbeat signal is relatively weak, we consider only its SOH. This weight can be calculated as follows:
where
\(c_t^m \in C_{t}^F\),
\(c_t^n \in C_{t}^S\),
\(g_{h}(\cdot) \sim \mathcal {N}(0, \sigma _h^2)\) is another Gaussian and
\(\sigma _h\) is empirically set to 2.
•
Peak Prominence. We observe that real peaks are usually “sharp” (i.e., have high prominence), even when their amplitude is small. Although the candidates come from both the time domain (ZC, PK) and the frequency domain (LMD, SOH), we weight all of them by the prominence of the spectral peak, at the corresponding estimated frequency, of the heartbeat signal reconstructed according to (
14). We do so because the spectral pattern (i.e., the distribution of peak prominence) is resilient to noise and serves as a reliable indicator for selecting candidates estimated by either temporal or spectral methods. We use an exponential distribution to represent this weight:
where
\(p(c_t^m)\) is the peak prominence that quantifies how much the candidate
\(c_t^m\) peak stands out due to its height and location relative to other nearby peaks, and the scale factor
\(\alpha\) is empirically set to 1.
•
Temporal Locality. The heart rate is not likely to change abruptly in a short time (e.g., 1 s), and the next heart rate is usually close to the current one. Therefore, we quantify how close a candidate is to the previous estimation as:
where
\(g_{l}(\cdot) \sim \mathcal {N}(0, \sigma _l^2)\) is another Gaussian and
\(\sigma _l^2\) is the variance of the heart rate trend.
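The three Gaussian-based evidence terms above can be illustrated with the sketch below. The functional forms are assumptions chosen only to match the verbal descriptions (a weight that falls toward 0 near a respiration harmonic, rises toward 1 when an SOH candidate agrees, and rises toward 1 near the previous estimate); they are not the source's equations. The sketch further assumes that all quantities are in bpm, that the Gaussian is unnormalized so that it peaks at 1, and that \(C_{t}^S\) holds already-halved SOH estimates. All function names are hypothetical, and the prominence term is left out.

```python
import numpy as np


def gaussian(d, sigma):
    """Unnormalised zero-mean Gaussian kernel, equal to 1 at d = 0."""
    return np.exp(-0.5 * (np.asarray(d, dtype=float) / sigma) ** 2)


def respiration_harmonic_weight(c, f_resp, n_harmonics=5, sigma_r=2.0):
    """Assumed form: near 0 when c sits on a respiration harmonic."""
    harmonics = f_resp * np.arange(1, n_harmonics + 1)
    return float(1.0 - gaussian(c - harmonics, sigma_r).max())


def heartbeat_harmonic_weight(c, soh_candidates, sigma_h=2.0):
    """Assumed form: near 1 when some (halved) SOH candidate agrees with c."""
    soh = np.asarray(soh_candidates, dtype=float)
    return float(gaussian(c - soh, sigma_h).max())


def temporal_locality_weight(c, x_prev, sigma_l):
    """Assumed form: near 1 when c is close to the previous estimate."""
    return float(gaussian(c - x_prev, sigma_l))
```

For example, with a respiration rate of 15 bpm, a candidate at 45 bpm coincides with the third harmonic and receives a respiration-harmonic weight of 0 under these assumptions, whereas a candidate at 70 bpm is 5 bpm from the nearest harmonic and is barely penalized.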
We define the likelihood of a candidate to be the heart rate as the cumulative evidence in a product form:
The normalized weight for a candidate is expressed as
Then, we take the weighted average of all of the candidates as a new measurement:
We observe that the error of the weighted measurement can be considered zero-mean Gaussian (the Kolmogorov-Smirnov statistic is 0.036, below 0.05, the threshold under which two distributions are considered the same [
25]). Therefore, we apply a Kalman filter that repeats the following steps at each discrete time step to update the heart rate whenever a new candidate set arrives:
where
\(K_t\) is the Kalman Gain,
\(\sigma _M^{2}\) and
\(\sigma _t^{2}\) are the variances of the measurement noise (from
\(\bar{c}_{t}\)) and the process noise, respectively; the latter is initialized with
\(\sigma _p^{2}\).
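The fusion and update steps can be sketched as follows. The first function multiplies the per-candidate evidence terms, normalizes the products, and takes the weighted average, as described above. The second is the textbook scalar Kalman measurement update, used here as a stand-in because the exact update equations are not reproduced in this section. All function and parameter names are illustrative.

```python
import numpy as np


def fuse_candidates(candidates, evidence):
    """Weighted-average measurement from candidates and their evidence.

    candidates: shape (M,), heart rate candidates in bpm.
    evidence:   shape (M, K), K evidence weights per candidate.
    """
    c = np.asarray(candidates, dtype=float)
    likelihood = np.prod(np.asarray(evidence, dtype=float), axis=1)
    total = likelihood.sum()
    if total <= 0.0:
        weights = np.full(c.size, 1.0 / c.size)  # no evidence: uniform
    else:
        weights = likelihood / total
    return float(weights @ c), weights


def kalman_update(x_pred, var_pred, measurement, var_meas):
    """Textbook scalar Kalman measurement update."""
    gain = var_pred / (var_pred + var_meas)
    x_new = x_pred + gain * (measurement - x_pred)
    var_new = (1.0 - gain) * var_pred
    return x_new, var_new, gain
```

A gain close to 1 means the filter trusts the fused measurement, while a gain close to 0 means it keeps the prediction; with equal prediction and measurement variances the update lands halfway between the two.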