Abstract:
Multimodal learning is essential for understanding interactions between different input domains. However, dealing with multiple modalities often leads to a large number of network parameters and extended training time. To tackle these challenges, a recent approach called "missing modality aware prompting" enhances model robustness with minimal additional parameters by freezing the transformer-based backbone network and introducing missing modality aware prompts. In this paper, we propose a robust missing modality aware prompting approach that adds noise to the prompts while keeping the same number of parameters as the naive prompts. Our experiments demonstrate that robust missing modality aware prompts outperform state-of-the-art missing modality prompt-based learning in various scenarios. Additionally, our ablation study verifies the effectiveness of robust missing modality aware prompts across different signal-to-noise ratios.
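The abstract describes freezing the pretrained transformer backbone, learning prompts specific to each missing-modality case, and perturbing those prompts with noise during training. A minimal sketch of that idea, assuming a PyTorch-style backbone and hypothetical names (RobustPromptWrapper, noise_std, missing_case), might look as follows; the exact prompt placement and noise schedule are illustrative assumptions, not the paper's implementation.

```python
# Sketch (not the authors' code): noise-perturbed missing-modality-aware
# prompts prepended to a frozen transformer backbone.
import torch
import torch.nn as nn

class RobustPromptWrapper(nn.Module):
    def __init__(self, backbone: nn.Module, embed_dim: int,
                 prompt_len: int = 16, num_missing_cases: int = 3,
                 noise_std: float = 0.1):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():   # freeze the pretrained backbone
            p.requires_grad = False
        # One learnable prompt per missing-modality case
        # (e.g. complete, image-missing, text-missing).
        self.prompts = nn.Parameter(
            torch.randn(num_missing_cases, prompt_len, embed_dim) * 0.02)
        self.noise_std = noise_std  # controls the effective signal-to-noise ratio

    def forward(self, tokens: torch.Tensor, missing_case: int) -> torch.Tensor:
        # tokens: (batch, seq_len, embed_dim) multimodal input embeddings.
        prompt = self.prompts[missing_case].unsqueeze(0).expand(
            tokens.size(0), -1, -1)
        if self.training and self.noise_std > 0:
            # Additive Gaussian noise on the prompts adds robustness
            # without introducing any extra parameters.
            prompt = prompt + torch.randn_like(prompt) * self.noise_std
        return self.backbone(torch.cat([prompt, tokens], dim=1))
```

Only the prompt tensors are trainable here, so the parameter count matches the naive prompting setup; the noise standard deviation is the knob an SNR ablation would sweep.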
Published in: 2023 IEEE International Conference on Visual Communications and Image Processing (VCIP)
Date of Conference: 04-07 December 2023
Date Added to IEEE Xplore: 29 January 2024
ISBN Information: