Abstract:
Recently, multimodal remote sensing image (MRSI) classification has attracted increasing attention from researchers. However, classifying MRSI with limited labeled instances remains a challenging task. In this article, a novel self-supervised cross-modal contrastive learning (CMCL) method is proposed for MRSI classification. Intramodal contrastive learning (IMCL) and CMCL are combined to better mine multimodal feature representations during pretraining; the IMCL and CMCL objectives are jointly optimized, which encourages the learned representations to be semantically consistent both within and between modalities. Moreover, a simple but effective hybrid cross-modal fusion module (HCFM) is designed for the fine-tuning stage, which compactly integrates complementary information across modalities for more accurate classification. Extensive experiments are conducted on four benchmark datasets (i.e., Houston 2013; Augsburg, Germany; Trento, Italy; and Berlin, Germany), and the results show that the proposed method outperforms state-of-the-art methods.
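The abstract does not give the exact form of the pretraining objective, but a joint intramodal-plus-cross-modal contrastive loss can be sketched with a generic InfoNCE formulation. The sketch below is a NumPy illustration under assumptions: `za1`/`za2` are two augmented views of modality A embeddings, `zb1`/`zb2` of modality B, the instance at row i in one set is the positive for row i in the other, and `lam` is a hypothetical weighting between the IMCL and CMCL terms (none of these names or choices come from the paper).

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: row i of `anchors` should match row i of `positives`."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # positives lie on the diagonal

def joint_contrastive_loss(za1, za2, zb1, zb2, lam=1.0):
    """Hypothetical joint objective: intramodal terms contrast two augmented
    views within each modality; cross-modal terms pair the two modalities."""
    imcl = info_nce(za1, za2) + info_nce(zb1, zb2)
    cmcl = info_nce(za1, zb1) + info_nce(zb1, za1)
    return imcl + lam * cmcl
```

With perfectly aligned embeddings across views and modalities the joint loss approaches zero, while unrelated random embeddings yield a loss near `4 * log(N)`, which is the sense in which minimizing it enforces semantic consistency within and between modalities.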
Published in: IEEE Transactions on Geoscience and Remote Sensing (Volume: 61)