Multimodal spatio-temporal-spectral fusion for deep learning applications in physiological time series processing: A case study in monitoring the depth of anesthesia

doi:10.1016/j.inffus.2021.03.001

Information Fusion

Volume 73, September 2021, Pages 125-143

https://doi.org/10.1016/j.inffus.2021.03.001

Under a Creative Commons license

open access

Highlights

• Fusing multi-modal physiological data based on the time-frequency ridge.
• Compressing physiological data directly at the source.
• Embedding spatial, temporal, and spectral characteristics in a single unified representation.
• Transfer learning from image-based architectures to time series classification.
• Converting physiological time series classification into a visual pattern recognition task.

Abstract

Physiological signal processing brings challenges including dimensionality (due to the number of channels), heterogeneity (due to the different ranges of values) and multimodality (due to the different sources). In this regard, the current study intended, first, to use time-frequency ridge mapping to explore whether fused information from joint EEG-ECG recordings can track the transitions between different states of anesthesia. Second, it investigated the effectiveness of pre-trained state-of-the-art deep learning architectures for learning discriminative features in the fused data in order to classify the states during anesthesia. Experimental data from healthy-brain patients undergoing surgery (N = 20) were used for this study. Data were recorded with the BrainStatus device using a single ECG channel and 10 EEG channels. The obtained results support the hypothesis that ridge fusion not only captures temporal-spectral progression patterns across all modalities and channels, but that this simplified interpretation of the time-frequency representation also accelerates the training process while significantly improving the efficiency of deep models. Classification outcomes demonstrate that this fusion can yield better performance, with 94.14% precision and 0.28 s prediction time, than commonly used data-level fusion methods. To conclude, the proposed fusion technique makes it possible to embed time-frequency information as well as spatial dependencies over modalities and channels in a single 2D array. This integration technique shows significant benefit in obtaining a more unified and global view of different aspects of the physiological data at hand while maintaining the desired performance level in decision making.
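As a rough illustration of how per-channel time-frequency ridges might be collected into one 2D array, the sketch below takes the dominant frequency of each short-time Fourier frame as the "ridge" and stacks the ridges of all channels row by row. This is an assumption made for illustration only: the abstract does not specify the ridge extraction procedure, and the function names (ridge_frequencies, fuse_ridges), the window length, the hop size and the synthetic signals are all hypothetical.

```python
import numpy as np


def ridge_frequencies(signal, fs, win_len=256, hop=128):
    """Return the dominant frequency (Hz) of each short-time Fourier frame.

    Assumption: the per-frame argmax of spectral power stands in for a
    time-frequency ridge; the study's actual ridge method may differ.
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    ridge = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + win_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        power[0] = 0.0  # ignore the DC component
        ridge[i] = freqs[np.argmax(power)]
    return ridge


def fuse_ridges(channels, fs, **stft_kwargs):
    """Stack per-channel ridges into a (n_channels, n_frames) array."""
    return np.vstack([ridge_frequencies(ch, fs, **stft_kwargs) for ch in channels])


# Synthetic stand-ins for one EEG-like and one ECG-like channel (hypothetical data).
fs = 256
t = np.arange(0, 8, 1 / fs)
eeg_like = np.sin(2 * np.pi * 10 * t)  # 10 Hz oscillation
ecg_like = np.sin(2 * np.pi * 2 * t)   # 2 Hz oscillation
fused = fuse_ridges([eeg_like, ecg_like], fs)
print(fused.shape)
```

Each row of the resulting array follows one channel over time, so the rows carry the spatial (channel and modality) dimension, the columns the temporal dimension, and the values the spectral information. An array of this kind could then be rendered as an image for a pre-trained image classifier, which is the general idea the abstract describes.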

Keywords

Dimensionality reduction

Spatial fusion

Temporal fusion

Spectral fusion

Multimodality

Ridge