Abstract
In recent years, deep learning (DL)-based super-resolution techniques for remote sensing images have made significant progress. However, these models remain limited in capturing long-range non-local information and in reusing features, and they are prone to vanishing and exploding gradients. To overcome these challenges, we propose the Enhanced Hybrid Attention Transformer (EHAT) framework, which builds on the Hybrid Attention Transformer (HAT) backbone and combines a region-level non-local neural network block with a skip fusion network (SFN) to form a new skip fusion attention group (SFAG). In addition, we form a Multi-attention Block (MAB) by introducing a spatial frequency block (SFB) based on fast Fourier convolution. We have conducted extensive experiments on the UC Merced, CLRS, and RSSCN7 datasets. The results show that our method improves the PSNR by about 0.2 dB on UC Merced \(\times 4\).
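To give a sense of scale for the reported PSNR gain, the following is a minimal sketch of how PSNR is commonly computed and of what a 0.2 dB improvement implies for mean squared error. It is not the authors' evaluation code. The abstract does not state the peak value, colour space, or border cropping used in the paper, so the 8-bit peak of 255 and the flat pixel lists are assumptions, and the function names `psnr` and `mse_ratio_for_gain` are hypothetical.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    max_value=255.0 assumes 8-bit images; the paper's setting is not given
    in the abstract.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE is multiplied when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    hr = [52, 55, 61, 59]  # made-up ground-truth pixels
    sr = [50, 56, 60, 62]  # made-up reconstructed pixels
    print(f"PSNR: {psnr(hr, sr):.2f} dB")
    print(f"MSE factor for +0.2 dB: {mse_ratio_for_gain(0.2):.4f}")
```

Because PSNR is logarithmic in the error, a gain of 0.2 dB corresponds to multiplying the MSE by roughly 0.955, that is, a reduction of about 4.5%.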
Acknowledgements
This work was supported by the National Key Research and Development Program of China (Grant No. 2021YFC3101601).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, J., Xie, Z., Du, Y., Song, W. (2025). EHAT: Enhanced Hybrid Attention Transformer for Remote Sensing Image Super-Resolution. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15038. Springer, Singapore. https://doi.org/10.1007/978-981-97-8685-5_16
DOI: https://doi.org/10.1007/978-981-97-8685-5_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8684-8
Online ISBN: 978-981-97-8685-5
eBook Packages: Computer Science, Computer Science (R0)