Abstract:
Compared with general optical images, hyperspectral images (HSIs) contain richer spectral information. On one hand, this provides a sufficient basis for ground object recognition. On the other hand, it results in the intermingling of spatial and spectral information. To make better use of the rich spatial and spectral information in HSIs, we resort to the vision transformer (ViT). Specifically, we propose the cross spatial–spectral dense transformer (CS2DT) for spatial–spectral feature extraction and fusion. For feature extraction, CS2DT employs the adaptive dense encoder (ADE) module, which enables the extraction of multiscale semantic information. During the feature fusion stage, we use the cross spatial–spectral attention (CS2A) module, based on the cross-attention (CA) operation, to better integrate spatial and spectral features. We evaluate the classification performance of the proposed CS2DT on three well-known datasets through extensive experiments. Experimental results demonstrate that CS2DT achieves higher accuracy and greater stability than state-of-the-art (SOTA) methods. The source code will be made available at https://github.com/shouhengx/CS2DT.
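To make the fusion step concrete, below is a minimal sketch of how a cross-attention fusion between spatial and spectral token sequences could look in PyTorch. This is an illustrative assumption based on the standard multi-head cross-attention formulation, not the paper's CS2A implementation; the class name, dimensions, head count, and residual placement are all hypothetical choices for demonstration.

    # Minimal sketch of cross-attention (CA) fusion between a spatial and a
    # spectral token sequence, assuming the standard multi-head attention
    # formulation. All hyperparameters below are illustrative assumptions,
    # not the paper's actual CS2A configuration.
    import torch
    import torch.nn as nn


    class CrossAttentionFusion(nn.Module):
        """Queries come from one branch; keys/values come from the other."""

        def __init__(self, dim=64, num_heads=4):
            super().__init__()
            self.norm_q = nn.LayerNorm(dim)
            self.norm_kv = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, spatial_tokens, spectral_tokens):
            # Spatial tokens attend to spectral tokens; a symmetric module
            # would also run the reverse direction. One direction shown here.
            q = self.norm_q(spatial_tokens)
            kv = self.norm_kv(spectral_tokens)
            fused, _ = self.attn(q, kv, kv)
            return spatial_tokens + fused  # residual connection


    # Toy usage: batch of 2 samples, 16 spatial tokens, 8 spectral tokens,
    # embedding dim 64 (shapes are illustrative only).
    spatial = torch.randn(2, 16, 64)
    spectral = torch.randn(2, 8, 64)
    out = CrossAttentionFusion()(spatial, spectral)
    print(out.shape)  # torch.Size([2, 16, 64])

The key design point this sketch illustrates is that, unlike self-attention over a single concatenated sequence, cross-attention lets one modality query the other, so each spatial token can selectively aggregate spectral evidence.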
Published in: IEEE Geoscience and Remote Sensing Letters (Volume: 20)