Abstract
Due to light absorption and scattering, underwater images often suffer from color distortion, low contrast, and blurred details, which seriously degrade the effectiveness of advanced computer vision tasks. To address these degradation issues, this paper proposes an underwater image enhancement algorithm, the Deep Pooling and Multi-Scale Fusion Transformer (DPMFformer). The algorithm is composed of four key modules: the Dual-Balanced Multiscale Fusion (DBMF) module, the Deep Pooling Self-Attention Transformer (DPST), the Wavelet Sampling (WS) module, and the Global Spatial Feature Self-Attention Transformer (GSFAT). The DBMF module employs trainable color modules to simulate the gray-world assumption, achieving inter-channel color balance. The DPST module enhances the network’s ability to extract information from feature regions through a deep-pooling layer and a spatial attention mechanism. The WS module uses Haar wavelet sampling instead of conventional up- and down-sampling, preserving low-frequency information while improving the up-sampling result. The GSFAT module combines the Swin Transformer (SwinT) and the Position Embedding Cascading Transformer (PCET), enhancing the extraction of global information through position embedding and a sliding-window self-attention mechanism, thereby focusing attention on the degraded regions of the image. Experimental results show that the proposed DPMFformer outperforms existing underwater image enhancement methods.
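Two of the ideas named in the abstract can be illustrated in a few lines. The sketch below is not the DPMFformer implementation: it assumes a fixed (non-trainable) gray-world balance in place of the trainable color modules of DBMF, and a single-level orthonormal Haar transform in place of the WS module. All function names (`gray_world_balance`, `haar_down`, `haar_up`) are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Channel = List[List[float]]  # one image channel stored as rows of floats


def gray_world_balance(rgb: List[Channel]) -> List[Channel]:
    """Fixed (non-trainable) gray-world balance; illustrative assumption.

    Scales every channel so that its mean equals the average of all
    channel means.
    """
    means = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in rgb]
    target = sum(means) / len(means)
    # Leave an all-zero channel unchanged instead of dividing by zero.
    gains = [target / m if m > 0 else 1.0 for m in means]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(rgb, gains)]


def haar_down(x: Channel) -> Tuple[Channel, Channel, Channel, Channel]:
    """One-level orthonormal 2-D Haar analysis (height and width must be even).

    Returns the low-frequency band LL and three detail bands, each at half
    resolution. Sub-band naming conventions vary between libraries.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(x), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]          # top row of the 2x2 block
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom row of the 2x2 block
            r_ll.append((a + b + c + d) / 2)
            r_lh.append((a - b + c - d) / 2)
            r_hl.append((a + b - c - d) / 2)
            r_hh.append((a - b - c + d) / 2)
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def haar_up(ll: Channel, lh: Channel, hl: Channel, hh: Channel) -> Channel:
    """Inverse of haar_down: rebuilds the full-resolution channel."""
    out: Channel = []
    for r_ll, r_lh, r_hl, r_hh in zip(ll, lh, hl, hh):
        top, bottom = [], []
        for s, h, v, g in zip(r_ll, r_lh, r_hl, r_hh):
            top += [(s + h + v + g) / 2, (s - h + v - g) / 2]
            bottom += [(s + h - v - g) / 2, (s - h - v + g) / 2]
        out += [top, bottom]
    return out
```

Unlike strided pooling, the four Haar sub-bands together retain all the information in the input, so `haar_up(*haar_down(x))` recovers `x` up to floating-point error. This invertibility is one plausible reason for preferring wavelet sampling over conventional down- and up-sampling.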
Data availability
No datasets were generated or analysed during the current study.
Funding
This work was supported by the Special Projects in Universities' Key Fields of Guangdong Province (2023ZDZX3017), the 2022 Tertiary Education Scientific Research Project of Guangzhou Municipal Education Bureau (202234607), the National Natural Science Foundation of China (52101358), and the General Universities' Key Scientific Research Platform Project of Guangdong Province (2023KSYS009).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Dan Xiang, Wenlei Yang, Zebin Zhou, Jinwen Zhang, Jianxin Li, Jing Ling and Jian Ouyang. The first draft of the manuscript was written by Wenlei Yang and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Communicated by: Hassan Babaie
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xiang, D., Yang, W., Zhou, Z. et al. DPMFformer: an underwater image enhancement network based on deep pooling and multi-scale fusion transformer. Earth Sci Inform 18, 61 (2025). https://doi.org/10.1007/s12145-024-01573-3
DOI: https://doi.org/10.1007/s12145-024-01573-3