MaskFusionNet: A Dual-Stream Fusion Model With Masked Pre-Training Mechanism for rPPG Measurement


Abstract:

Remote photoplethysmography (rPPG) has considerable significance in areas such as disease diagnosis and emotion analysis. Recent rPPG models have demonstrated excellent performance due to their powerful heart rate information extraction capabilities. However, these models often focus on limited regions of interest (ROIs) in the facial image, which makes them sensitive to interference: if the ROI is affected by muscle movement, lighting variation, or noise, the model’s performance degrades significantly. To address this limitation, we propose a two-stage model called MaskFusionNet. 1) During the pre-training stage, a mask-reconstruction mechanism with a tube masking strategy drives MaskFusionNet to learn rPPG information from various facial regions, which enhances the model’s ability to resist interference. Based on the periodicity and continuity of the heart rate signal, we also design a novel spatio-temporal reconstruction loss function that focuses on the data’s spatial features and temporal continuity. 2) In the fine-tuning stage, we propose the Multi-Scale Fusion Block (MFB) to combine multi-scale features from the dual-stream network. The MFB allows the model to detect subtle heart rate variations in adjacent frames while minimizing the impact of interference by extracting features over longer segments. The transformer-based MaskFusionNet can extract multi-scale fused heart rate features from a wide range of skin regions while preserving its ability to model long-range sequence information. To validate its effectiveness, we extensively evaluate our model on three benchmark datasets (VIPL-HR, COHFACE, and PURE), demonstrating its superior performance in both intra-dataset and cross-dataset testing scenarios.
Page(s): 11521 - 11534
Date of Publication: 03 July 2024
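
The tube masking strategy named in the abstract is not specified further on this page. As a rough illustration only, the sketch below shows the general idea of tube masking as it is commonly used in masked video pre-training: one random spatial mask is drawn over the patch grid and repeated across every frame, so a masked patch stays hidden for the whole clip. The function name `tube_mask`, the patch-grid layout, and the default `mask_ratio` of 0.75 are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def tube_mask(num_frames, grid_h, grid_w, mask_ratio=0.75, seed=None):
    """Return a boolean mask of shape (num_frames, grid_h, grid_w).

    True marks a masked patch. The same spatial pattern is repeated in
    every frame, which is what makes the mask a "tube" through time.
    NOTE: mask_ratio=0.75 is an illustrative assumption, not the paper's value.
    """
    rng = np.random.default_rng(seed)
    num_patches = grid_h * grid_w
    num_masked = int(round(mask_ratio * num_patches))

    # Draw one spatial mask by choosing patch positions without replacement.
    spatial = np.zeros(num_patches, dtype=bool)
    masked_idx = rng.choice(num_patches, size=num_masked, replace=False)
    spatial[masked_idx] = True
    spatial = spatial.reshape(grid_h, grid_w)

    # Repeat the spatial mask along the time axis; copy so it is writable.
    return np.broadcast_to(spatial, (num_frames, grid_h, grid_w)).copy()


if __name__ == "__main__":
    mask = tube_mask(num_frames=8, grid_h=4, grid_w=4, seed=0)
    print(mask.shape)          # (8, 4, 4)
    print(int(mask[0].sum()))  # 12 masked patches per frame (0.75 * 16)
```

Because the hidden positions are identical in every frame, a model cannot recover a masked patch by copying it from a neighbouring frame and has to rely on the visible regions instead, which fits the abstract's stated aim of learning rPPG information from various facial regions.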
