Attention-guided dynamic multi-branch neural network for underwater image enhancement
Introduction
Underwater image enhancement (UIE) is a fundamental computer vision task that has become an active research topic in image processing in recent years. The main purpose of UIE is to recover a clean image from its degraded version by eliminating degradations (e.g., color deviation and low contrast caused by wavelength-dependent attenuation) [1], [2], [3]. Research on this problem benefits potential applications such as underwater detection [4] and marine environmental surveillance [5]. However, UIE is an extremely challenging and ill-posed task, owing to the attenuation properties of the medium and the diversity of underwater image distributions.
Earlier techniques explicitly relied on image priors handcrafted from empirical observations. These prior-based techniques often perform worse than expected because strong image priors are difficult to construct and usually fail to generalize. Recently, the rapid development of convolutional neural networks (CNNs) has brought new strategies and perspectives to image processing. As a result, CNNs have emerged as the dominant UIE approach, achieving state-of-the-art results thanks to their strong feature representation capabilities.
Natural scenes contain different regions, which should be analyzed at distinct scales. For instance, smooth regions correspond to features at larger scales, while textured regions correspond to features at smaller scales [6]. However, most CNN-based UIE methods have a finite receptive field (RF), which makes it difficult to estimate multiscale features. This inevitably limits their applicability to diverse water types and complex degradation levels.
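To make the notion of a finite RF concrete, the following minimal sketch computes the RF of a stack of convolution layers using the standard recurrence (each layer adds `(kernel - 1) * dilation` steps, scaled by the accumulated stride). The branch configurations are hypothetical examples chosen for illustration and are not taken from the paper.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of a stack of conv layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples, ordered from
    the input of the stack to its output.
    """
    rf = 1    # before any layer, one unit sees exactly one input pixel
    jump = 1  # distance, in input pixels, between neighbouring units
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical branches (illustrative only, not the paper's configuration).
example_branches = {
    "two 3x3 convs": [(3, 1, 1), (3, 1, 1)],
    "two 5x5 convs": [(5, 1, 1), (5, 1, 1)],
    "two dilated 3x3 convs": [(3, 1, 2), (3, 1, 2)],
}

if __name__ == "__main__":
    for name, layers in example_branches.items():
        print(name, receptive_field(layers))
```

Under these assumptions, a single stack covers one fixed window size, which is why branches with different kernel sizes or dilations are needed to observe smooth and textured regions at different scales.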
To address the aforementioned problems, we present an effective CNN-based UIE solution that builds on two main observations. First, a few UIE techniques [7], [8] based on multiscale CNNs have been introduced, which merge multiscale features linearly. However, numerous studies [9], [10], [11] have demonstrated that the RF sizes of neurons in the same region (e.g., the primary visual cortex) differ and are naturally adjusted by the stimulus. Therefore, these techniques cannot truly imitate the remarkable ability of neurons to adaptively integrate visual inputs, which restricts their performance. Second, enhancing underwater images is a wavelength-sensitive task because light of different wavelengths is attenuated to different degrees [12], [13], [14]. Each channelwise feature in an image embodies a distinct type of input information. Some features may help to correct color deviation, and others may contribute to improving low contrast. Ignoring the interdependencies across channels weakens the representational power of the network.
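The second observation can be illustrated with a generic channel-gating sketch in the style of squeeze-and-excitation attention. This is not the paper's MCAM, whose details are not given in this excerpt. The names `channel_gate`, `gate_weights`, and `gate_biases` are hypothetical, and the weights stand in for parameters that a network would learn.

```python
import math


def channel_gate(features, gate_weights, gate_biases):
    """Rescale each channel by a sigmoid gate that depends on all channels.

    features:     C channels, each a flat list of spatial values.
    gate_weights: C x C matrix (hypothetical learned parameters) that mixes
                  the per-channel descriptors, modelling interdependencies.
    gate_biases:  C biases (hypothetical learned parameters).
    Returns (scaled_features, gates).
    """
    # Squeeze: summarise every channel by its global average.
    descriptors = [sum(channel) / len(channel) for channel in features]

    # Excite: each gate is a sigmoid of a weighted mix of all descriptors.
    gates = []
    for row, bias in zip(gate_weights, gate_biases):
        logit = sum(w * d for w, d in zip(row, descriptors)) + bias
        gates.append(1.0 / (1.0 + math.exp(-logit)))

    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[gate * value for value in channel]
              for gate, channel in zip(gates, features)]
    return scaled, gates
```

Because every gate is computed from the descriptors of all channels, a channel that carries color information can be emphasized or suppressed depending on what the other channels contain.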
Based on the above motivation, we introduce a building block called the attention-guided dynamic multibranch block (ADMB), which extracts the desired dynamic information from inputs at different branches. ADMB is mainly composed of a dynamic feature selection module (DFSM) and a multiscale channel attention-guided module (MCAM). To effectively advance the UIE task, multiple ADMB units are stacked in an end-to-end architecture, termed the attention-guided dynamic multibranch neural network (ADMNNet). Concretely, DFSM is proposed to exploit the benefits of RF properties: it receives multiple branches with different RF sizes. The information from these branches is merged to produce a final, global representation for the selection weights. An attention mechanism is then applied to estimate the selection weights, which adaptively emphasize the most representative features. MCAM is proposed to improve the network's ability to model channelwise variation. The proposed ADMNNet can effectively remove color deviation and improve detailed information, as shown in Fig. 1. Briefly, the main contributions of this work are as follows.
We propose an attention-guided dynamic multibranch neural network (ADMNNet) that achieves superior performance and adaptability on complex and diverse underwater images. Extensive experiments show that the proposed model exhibits excellent robustness and flexibility compared with state-of-the-art methods.
We develop a dynamic feature selection module (DFSM), in which the RF size of neurons is adaptively adjusted according to the stimulus. More importantly, soft attention is used to implement the selective kernel mechanism across multiscale features. Put differently, our network adjusts itself adaptively based on the contextual information of the input.
We design a multiscale channel attention-guided module (MCAM), which explores channelwise information more effectively.
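The merge, attend, and select steps described for DFSM can be sketched in the spirit of a selective kernel mechanism. The sketch below is a simplified illustration under stated assumptions, not the authors' implementation: `select_and_fuse` and `branch_scales` are hypothetical names, and a single per-branch, per-channel scale replaces whatever learned layers the actual module uses.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_and_fuse(branches, branch_scales):
    """Fuse branch features with per-channel soft attention across branches.

    branches:      B branch outputs, each a list of C channels, each channel
                   a flat list of N spatial values.
    branch_scales: B x C hypothetical stand-ins for learned parameters that
                   turn the pooled descriptor into per-branch logits.
    Returns (fused, weights): fused is C x N, weights is B x C.
    """
    num_branches = len(branches)
    num_channels = len(branches[0])
    num_pixels = len(branches[0][0])

    # 1) Merge: element-wise sum of all branches.
    merged = [[sum(branch[c][n] for branch in branches)
               for n in range(num_pixels)]
              for c in range(num_channels)]

    # 2) Squeeze: one global descriptor per channel (global average pooling).
    descriptor = [sum(channel) / num_pixels for channel in merged]

    # 3) Attend: softmax across branches, computed separately per channel.
    per_channel = [softmax([branch_scales[b][c] * descriptor[c]
                            for b in range(num_branches)])
                   for c in range(num_channels)]
    weights = [[per_channel[c][b] for c in range(num_channels)]
               for b in range(num_branches)]

    # 4) Select: weighted sum of the branches using the selection weights.
    fused = [[sum(weights[b][c] * branches[b][c][n]
                  for b in range(num_branches))
              for n in range(num_pixels)]
             for c in range(num_channels)]
    return fused, weights
```

Because the weights for each channel sum to one across branches, the output is a convex combination of the branch features. The effective RF therefore shifts toward whichever branch the input favors, in contrast to a fixed linear merge.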
The rest of this paper is organized as follows: Section 2 introduces related work. The proposed method is presented in Section 3. Section 4 evaluates and compares the experimental results. Section 5 concludes the paper.
Section snippets
Related work
Handcrafted prior-based methods. Based on the underwater physical imaging model [17], a large number of handcrafted methods [18], [19], [20] execute the enhancement procedure in an inverse manner. Without any extra information, specific priors [21] as constraints are used to estimate the derived parameters (background light and transmission map) of the physical model [22]. For example, Drews et al. [23] proposed the underwater dark channel prior (UDCP), which compensated for the attenuation by
Proposed method
In this section, we describe an end-to-end attention-guided dynamic multibranch neural network (ADMNNet) for underwater image enhancement tasks. ADMNNet is built from the proposed attention-guided dynamic multibranch block (ADMB), whose core components are (a) the dynamic feature selection module (DFSM) and (b) the multiscale channel attention-guided module (MCAM). The overall architecture of ADMNNet is illustrated in Fig. 2. Given an underwater image as input, ADMNNet first applies a depthwise
Experimental settings
Datasets. To train ADMNNet, both real-world and synthetic underwater images are used. First, 800 paired images are randomly selected from the UIEB [34] dataset as the training set. In detail, the UIEB dataset contains 890 real underwater images with manually selected reference images. For testing, the remaining 90 images with references are employed, denoted as Test-R90 [34]. Although the original images in the UIEB dataset have different levels of contrast reduction and diverse scenes, the
Conclusion
In this paper, we propose an efficient yet simple attention-guided dynamic multibranch neural network (ADMNNet) for underwater image enhancement tasks. To learn the feature representations from different branches, an attention-guided dynamic multibranch block (ADMB) is designed. The ADMB, as the core component of our model, is composed of the dynamic feature selection module (DFSM) and multiscale channel attention-guided module (MCAM). Specifically, our DFSM implicitly models multiscale
CRediT authorship contribution statement
Xiaohong Yan: Writing – original draft, Writing – review & editing, Conceptualization, Methodology. Wenqiang Qin: Software. Yafei Wang: Writing – review & editing, Funding acquisition. Guangyuan Wang: Validation, Data curation. Xianping Fu: Supervision, Project administration, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
The authors sincerely thank the editors and anonymous reviewers for the very helpful and kind comments to assist in improving the presentation of our paper. This work was supported in part by the National Natural Science Foundation of China under Grant 62176037, Grant 62002043, and Grant 61802043, by the Liaoning Revitalization Talents Program, China under Grant XLYC1908007, by the Foundation of Liaoning Key Research and Development Program, China under Grant 201801728, by the Dalian Science
References (57)
et al. Single underwater image enhancement using integrated variational model. Digit. Signal Process. (2022)
et al. A unified total variation method for underwater image enhancement. Knowl.-Based Syst. (2022)
et al. Spatio-contextual Gaussian mixture model for local change detection in underwater video. Expert Syst. Appl. (2018)
Recognition of fish species by colour and shape. Image Vis. Comput. (1993)
et al. Enhancing underwater image via adaptive color and contrast enhancement, and denoising. Eng. Appl. Artif. Intell. (2022)
et al. Automatic red-channel underwater image restoration. J. Vis. Commun. Image Represent. (2015)
et al. Color image dehazing using gradient channel prior and guided L0 filter. Inform. Sci. (2020)
et al. Depth-aware total variation regularization for underwater image dehazing. Signal Process., Image Commun. (2021)
et al. A novel biologically-inspired method for underwater image enhancement. Signal Process., Image Commun. (2022)
et al. Underwater image enhancement with global–local networks and compressed-histogram equalization. Signal Process., Image Commun. (2020)
FloodNet: Underwater image restoration based on residual dense learning. Signal Process., Image Commun.
Haze transfer and feature aggregation network for real-world single image dehazing. Knowl.-Based Syst.
A novel image-dehazing network with a parallel attention block. Pattern Recognit.
Underwater image enhancement via minimal color loss and locally adaptive contrast enhancement. IEEE Trans. Image Process.
Learning multiscale sparse representations for image and video restoration. Multiscale Model. Simul.
LAFFNet: A lightweight adaptive feature fusion network for underwater image enhancement.
Adaptive learning attention network for underwater image enhancement. IEEE Robot. Autom. Lett.
Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol.
Single image super-resolution using multi-scale deep encoder-decoder with phase congruency edge map guidance. Inform. Sci.
Deep neural networks for sensor-based human activity recognition using selective kernel convolution. IEEE Trans. Instrum. Meas.
Multi-purpose oriented real-world underwater image enhancement. IEEE Access
A natural-based fusion strategy for underwater image enhancement. Multimed. Tools Appl.
Enhancing underwater images and videos by fusion.
Underwater image enhancement via medium transmission-guided multi-color space embedding. IEEE Trans. Image Process.
A computer model for underwater camera systems. Ocean Opt. VI
A variational framework for underwater image dehazing and deblurring. IEEE Trans. Circuits Syst. Video Technol.
Transmission estimation in underwater single images.
Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell.