DOI: 10.1145/3581783.3613815

CTCP: Cross Transformer and CNN for Pansharpening

Published: 27 October 2023

Abstract

Pansharpening fuses a high-resolution panchromatic (PAN) image with a low-resolution multispectral (LRMS) image to produce an enhanced multispectral image with both high spectral and high spatial resolution. Current Transformer-based pansharpening methods neglect the interaction between the extracted long- and short-range features, resulting in spectral and spatial distortion in the fusion results. To address this issue, a novel cross Transformer and convolutional neural network (CNN) for pansharpening (CTCP) is proposed, which achieves better fusion results by designing a cross mechanism that enhances the interaction between long- and short-range features. First, a dual-branch feature extraction module (DBFEM) is constructed to extract features from the LRMS and PAN images separately, reducing aliasing between the two sets of image features. Within the DBFEM, a cross long-short-range feature module (CLSFM) is designed to improve the feature representation ability of the network; it combines the feature learning capabilities of the Transformer and the CNN via the cross mechanism, achieving the integration of long- and short-range features. Then, to improve spectral feature representation, a spectral feature enhancement fusion module (SFEFM) based on frequency channel attention is constructed to perform feature fusion. Finally, shallow features from the PAN image are reused to provide detail features, which are integrated with the fused features to obtain the final pansharpened result. To the best of our knowledge, this is the first attempt to introduce a cross mechanism between the Transformer and the CNN in the pansharpening field. Extensive experiments show that CTCP outperforms several state-of-the-art (SOTA) approaches both subjectively and objectively. The source code will be released at https://github.com/zhsu99/CTCP.
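The abstract does not give the exact formulation of the cross mechanism, but the idea of letting a local (CNN-style) branch and a global (Transformer-style) branch refine each other can be sketched in miniature. The following NumPy toy is an assumption-laden illustration, not the authors' implementation: `short_range` stands in for convolutional local features, `long_range` for single-head self-attention over pixels, and `cross_long_short` shows one plausible way each branch's output is enriched by the other before fusion. All three function names are hypothetical.

```python
import numpy as np

def short_range(x):
    # CNN-like local features: 3x3 average filter with zero padding.
    H, W = x.shape
    p = np.pad(x, 1)
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = p[i:i + 3, j:j + 3].mean()
    return out

def long_range(x):
    # Transformer-like global features: single-head self-attention
    # over the flattened pixels, each pixel treated as a 1-d token.
    v = x.reshape(-1, 1)                       # (HW, 1) tokens
    scores = v @ v.T / np.sqrt(v.shape[1])     # (HW, HW) similarities
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)    # row-wise softmax
    return (attn @ v).reshape(x.shape)

def cross_long_short(x):
    # Cross mechanism (hypothetical form): each branch's output is
    # refined by the complementary branch, then the two are fused.
    s, l = short_range(x), long_range(x)
    s_crossed = s + long_range(s)   # local features gain global context
    l_crossed = l + short_range(l)  # global features regain local detail
    return 0.5 * (s_crossed + l_crossed)

x = np.random.default_rng(0).random((8, 8))
y = cross_long_short(x)
print(y.shape)  # (8, 8)
```

The design point illustrated here is the interaction itself: rather than concatenating independent long- and short-range features at the end, each branch is re-processed through the other before fusion, which is what the paper argues prior Transformer-based methods omit.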


Cited By

  • (2025) Invertible Attention-Guided Adaptive Convolution and Dual-Domain Transformer for Pansharpening. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 18 (2025), 5217-5231. DOI: 10.1109/JSTARS.2025.3531353

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN:9798400701085
DOI:10.1145/3581783
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. convolutional neural network
  2. cross mechanism
  3. pansharpening
  4. transformer

Qualifiers

  • Research-article

Conference

MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa, ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
