Abstract
Underwater Image Enhancement (UIE) is critical for marine research and exploration but is hindered by complex color distortions and severe blurring. Recent deep learning-based methods have achieved remarkable results, yet they suffer from high computational costs and insufficient global modeling, which leads to locally under- or over-adjusted regions. We present PixMamba, a novel architecture designed to overcome these challenges by leveraging State Space Models (SSMs) for efficient global dependency modeling. Unlike convolutional neural networks (CNNs), which have limited receptive fields, and Transformer networks, which incur high computational costs, PixMamba captures global contextual information while remaining computationally efficient. Its dual-level design comprises the patch-level Efficient Mamba Net (EMNet), which reconstructs enhanced image features, and the pixel-level PixMamba Net (PixNet), which ensures fine-grained feature capture and global consistency of the enhanced image, properties that were previously difficult to obtain. PixMamba achieves state-of-the-art performance across various underwater image datasets and delivers visually superior results. Code is available at https://github.com/weitunglin/pixmamba.
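To make the dual-level idea concrete, below is a minimal, self-contained PyTorch sketch of a coarse patch-level branch followed by a pixel-level refinement branch. This is not the authors' implementation: the class names (GlobalTokenMixer, PatchLevelBranch, PixelLevelBranch, DualLevelUIE) and all layer choices are illustrative assumptions, and the simple token mixer merely stands in for the selective state-space (Mamba) blocks used by the actual EMNet and PixNet; see the linked repository for the real model.

```python
import torch
import torch.nn as nn


class GlobalTokenMixer(nn.Module):
    """Placeholder for a selective state-space (Mamba-style) token mixer.
    In PixMamba proper this would be an SSM block; a residual linear
    projection is used here only to keep the sketch self-contained."""
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (B, N, C) token sequence
        return x + self.proj(self.norm(x))


class PatchLevelBranch(nn.Module):
    """EMNet-style branch: embed the image into patches, mix tokens
    globally, then project back to an image-sized coarse estimate."""
    def __init__(self, dim=32, patch=4):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.mixer = GlobalTokenMixer(dim)
        self.unembed = nn.ConvTranspose2d(dim, 3, kernel_size=patch, stride=patch)

    def forward(self, x):
        f = self.embed(x)                       # (B, C, H/p, W/p)
        b, c, h, w = f.shape
        t = f.flatten(2).transpose(1, 2)        # (B, N, C) tokens
        t = self.mixer(t)
        f = t.transpose(1, 2).reshape(b, c, h, w)
        return self.unembed(f)                  # coarse enhanced image


class PixelLevelBranch(nn.Module):
    """PixNet-style branch: per-pixel refinement of the coarse output,
    conditioned on the raw input, for fine detail and consistency."""
    def __init__(self, dim=16):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(6, dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(dim, 3, 3, padding=1),
        )

    def forward(self, x, coarse):
        return coarse + self.refine(torch.cat([x, coarse], dim=1))


class DualLevelUIE(nn.Module):
    """Dual-level enhancement: patch-level reconstruction followed by
    pixel-level refinement (conceptual sketch only)."""
    def __init__(self):
        super().__init__()
        self.patch_branch = PatchLevelBranch()
        self.pixel_branch = PixelLevelBranch()

    def forward(self, x):
        coarse = self.patch_branch(x)
        return self.pixel_branch(x, coarse)


if __name__ == "__main__":
    model = DualLevelUIE()
    out = model(torch.randn(1, 3, 64, 64))
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```

The design choice illustrated here is the split of responsibilities: the patch-level branch models long-range context cheaply over a downsampled token grid, while the pixel-level branch operates at full resolution to recover fine detail that patch embedding would otherwise discard.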
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lin, WT., Lin, YX., Chen, JW., Hua, KL. (2025). PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement. In: Cho, M., Laptev, I., Tran, D., Yao, A., Zha, H. (eds) Computer Vision – ACCV 2024. ACCV 2024. Lecture Notes in Computer Science, vol 15475. Springer, Singapore. https://doi.org/10.1007/978-981-96-0911-6_11
DOI: https://doi.org/10.1007/978-981-96-0911-6_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0910-9
Online ISBN: 978-981-96-0911-6
eBook Packages: Computer Science, Computer Science (R0)