ABSTRACT
Fusing multisource remote sensing data is an important approach to improving pixel-wise classification performance. Generally, the richer the information fed into a model, the more diverse the knowledge it can learn, and the better it can classify. However, existing fusion methods usually accept only two modalities as input and struggle to balance the consistency and diversity of multisource features. In this paper, we propose a novel classification network, the multimodal equilateral absorption network (MEANet), which can fuse multiple kinds of remote sensing images. Specifically, three modal features are first extracted by a three-branch CNN. Then, a cross-modal interacting module (CIM) fuses the multimodal features. Third, an improved triplet loss is designed to trade off feature diversity against consistency, allowing the network to absorb multisource information more efficiently. Finally, pixel-wise summation and a fully connected (FC) layer produce the final classification results. Experiments on two datasets show that the proposed MEANet achieves competitive classification performance compared with several state-of-the-art methods.
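To make the described pipeline concrete, the sketch below outlines one plausible PyTorch arrangement of the stages named in the abstract: three per-modality CNN branches, a placeholder CIM, element-wise summation, and an FC classifier, trained with a cross-entropy term plus a triplet term. Everything beyond the abstract's wording is an assumption: the branch architecture, the CIM internals, the input channel counts (144/1/1 for three hypothetical modalities), and the way the standard triplet loss is applied all stand in for details the paper does not give here, including its "improved" triplet loss.

```python
# Hypothetical sketch of the pipeline described in the abstract (not the authors'
# released code). Branch widths, the CIM design, and the triplet-loss usage are
# illustrative assumptions.
import torch
import torch.nn as nn


class Branch(nn.Module):
    """Small CNN encoder for one modality's image patch (assumed design)."""
    def __init__(self, in_channels, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, dim, 3, padding=1), nn.BatchNorm2d(dim), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.BatchNorm2d(dim), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> per-patch feature vector
        )

    def forward(self, x):
        return self.net(x).flatten(1)  # (B, dim)


class CIM(nn.Module):
    """Placeholder cross-modal interacting module: each modality feature absorbs a
    learned mixture of all three. The paper's actual CIM may differ."""
    def __init__(self, dim=64):
        super().__init__()
        self.mix = nn.Linear(3 * dim, dim)

    def forward(self, fa, fb, fc):
        delta = self.mix(torch.cat([fa, fb, fc], dim=1))
        return fa + delta, fb + delta, fc + delta


class MEANetSketch(nn.Module):
    def __init__(self, chans=(144, 1, 1), dim=64, num_classes=15):
        super().__init__()
        self.branches = nn.ModuleList(Branch(c, dim) for c in chans)
        self.cim = CIM(dim)
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, xs):
        fa, fb, fc = (b(x) for b, x in zip(self.branches, xs))
        fa, fb, fc = self.cim(fa, fb, fc)
        fused = fa + fb + fc            # element-wise summation of modal features
        return self.fc(fused), (fa, fb, fc)


if __name__ == "__main__":
    model = MEANetSketch()
    # Dummy patches for three modalities (channel counts are assumptions).
    hsi, lidar, sar = (torch.randn(8, 144, 11, 11),
                       torch.randn(8, 1, 11, 11),
                       torch.randn(8, 1, 11, 11))
    logits, (fa, fb, fc) = model([hsi, lidar, sar])
    # Standard triplet loss as a stand-in for the paper's improved variant; treating
    # (fa, fb, fc) as (anchor, positive, negative) is purely illustrative.
    loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 15, (8,))) \
        + nn.TripletMarginLoss(margin=1.0)(fa, fb, fc)
    loss.backward()
```

The key design point the abstract emphasizes is the symmetric ("equilateral") treatment of the three modalities: each branch contributes equally to the fused representation, while the triplet-style constraint keeps the per-modality features neither collapsed into one another nor drifting apart.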