Abstract:
Hyperspectral (HS) video captures continuous spectral information of objects, enhancing material identification in tracking tasks. It is expected to overcome the inherent limitations of red–green–blue (RGB) and multimodal tracking, such as limited spectral cues and cumbersome modality alignment. However, HS tracking faces challenges including data scarcity, band gaps, and large data volumes. In this study, inspired by prompt learning in language models, we propose the prompting for hyperspectral video tracking (PHTrack) framework. PHTrack learns prompts to adapt foundation models, mitigating data scarcity while improving performance and efficiency. First, a modality prompter (MOP) is proposed to capture rich spectral cues and bridge band gaps for improved model adaptation and knowledge enhancement. In addition, a distillation prompter (DIP) is developed to refine cross-modal features. PHTrack adopts feature-level fusion, handling large data volumes more effectively than traditional decision-level fusion schemes. Extensive experiments validate the proposed framework, offering valuable insights for future research. The code and data will be available at
https://github.com/YZCU/PHTrack
Published in: IEEE Transactions on Geoscience and Remote Sensing (Volume 62)