A physics based generative adversarial network for single image defogging

https://doi.org/10.1016/j.imavis.2019.10.001

Abstract

In the field of single image defogging, there are two main methods. One is the image restoration method based on the atmospheric scattering theory, which can recover image texture details well. The other is the image enhancement method based on Retinex theory, which can improve image contrast well. In practice, however, the former can easily lead to low-contrast images, while the latter is prone to losing texture details. Therefore, how to effectively combine the advantages of both to remove fog is a key issue in the field. In this paper, we have developed a physics based generative adversarial network (PBGAN) to exploit the advantages of these two methods in parallel. To our knowledge, it is the first learning-based defogging framework that incorporates these two methods and enables them to work together and complement each other. Our method has two generative adversarial modules, the Contrast Enhancement (CE) module and the Texture Restoration (TR) module. To improve contrast in the CE module, we introduce a novel inversion-adversarial loss and a novel inversion-cycle consistency loss for training the generator. To improve texture in the TR module, we introduce two convolutional neural networks to learn the atmospheric light coefficient and the transmission map, respectively. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed approach performs better than several state-of-the-art methods quantitatively and qualitatively.

Introduction

The imaging process of foggy images is mainly affected by small particles in the atmosphere scattering and absorbing light. This degrades the visibility of the captured images, which affects not only applications such as video surveillance but also the performance of many computer vision tasks such as classification and segmentation. Defogging is, therefore, an important image processing task which has in recent years attracted a lot of attention in the computer vision community. In the published literature, there are two major categories of defogging methods: those based on image enhancement and those based on image restoration.

The advantage of image enhancement based methods is that they are simple and easy to use, such as histogram equalisation [34], wavelet analysis [8], and Retinex theory [19, 21, 29]. Arguably, Retinex theory based methods are amongst the best. However, these methods only improve the image contrast and sharpness and do not really remove the fog from the image. Moreover, artifacts such as color distortion and halos can appear in their defogged results. Thus, these kinds of methods are usually used as post-processing on the image after defogging.

Image restoration methods are usually based on a physics model. In particular, prior-based methods have drawn significant attention in this field over the last decade. These methods are also known as hand-crafted methods and are usually based on the atmospheric scattering model (ASM) [25]. With accurate transmission and scattering light estimation, they can remove fog from images and preserve the edge information. Fattal [10] discovered that surface shading is locally uncorrelated with the transmission, and used it to recover the scene albedo. Tan et al. [36] recovered the visibility of a foggy image by using a patch-based contrast-maximization approach. He et al. [15] presented a dark channel prior (DCP) for single image defogging, based on the statistical observation that the value of at least one channel of a fog-free image is close to zero. More recently, Zhu et al. [43] found that the scene depth is positively correlated with the difference between the brightness and saturation of the scene, and used this prior to recover the scene depth. Fattal [11] proposed a color-lines method based on the statistics of natural images, namely that the pixels of small image patches typically exhibit a one-dimensional distribution in the RGB color space. However, hand-crafted methods have their limitations due to their strict constraint conditions. For instance, according to the statistical prior of He et al. [15], objects with high intensity values such as the sky and white buildings do not have a dark channel, so the method fails when dealing with them. In addition, defogged results obtained through the physics model typically look dim and have low contrast.
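To make the DCP pipeline concrete, the following sketch estimates the dark channel, atmospheric light A, and transmission t, and then inverts the ASM, I(x) = J(x)t(x) + A(1 - t(x)), to recover the scene radiance J. This is a minimal illustration of the classic recipe; the patch size and the omega and t0 values are conventional choices, not code released with the cited papers.

```python
import numpy as np
import cv2

def dark_channel(img, patch=15):
    # Per-pixel minimum over the three color channels ...
    min_rgb = img.min(axis=2)
    # ... followed by a minimum filter over a local patch (erosion).
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    return cv2.erode(min_rgb, kernel)

def dcp_defog(img, omega=0.95, t0=0.1, patch=15):
    """Classic single-image defogging with the dark channel prior [15].
    `img` is assumed to be a float RGB array in [0, 1] of shape (H, W, 3)."""
    img = img.astype(np.float32)
    dark = dark_channel(img, patch)
    # Estimate atmospheric light A from the brightest 0.1% of dark-channel pixels.
    n = max(1, int(dark.size * 0.001))
    idx = np.unravel_index(np.argsort(dark, axis=None)[-n:], dark.shape)
    A = img[idx].max(axis=0)
    # Transmission estimate: t(x) = 1 - omega * dark_channel(I / A).
    t = 1.0 - omega * dark_channel(img / A, patch)
    # Invert the ASM: J = (I - A) / max(t, t0) + A.
    t = np.clip(t, t0, 1.0)[..., None]
    return np.clip((img - A) / t + A, 0.0, 1.0)
```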

In summary, both categories of methods have their own advantages and disadvantages. The question is whether there is a way to combine the advantages of both for single image defogging. Recently, deep learning-based methods [5, 30, 38] have shown powerful defogging ability and are able to overcome the disadvantages of hand-crafted priors. Cai et al. [5] proposed an end-to-end convolutional neural network with a novel BReLU unit that learns fog features from data to estimate the transmission. Ren et al. [30] combined a coarse-scale network and a fine-scale network, called the Multi-Scale convolutional neural network (MSCNN), to optimize the transmission. Zhang et al. [38] proposed a densely connected encoder-decoder structure with a multi-level pyramid pooling module for estimating the transmission map. AOD-Net [22] was proposed to generate a clean image through a light-weight CNN. Ren et al. [31] proposed a novel fusion-based network that learns the features of three inputs derived from a foggy image by applying white balance, contrast enhancement and gamma correction; the final defogged result is obtained by gating these features. However, these methods are still based on the atmospheric scattering model, so they inevitably produce artifacts in the defogged results due to inaccurate transmission and atmospheric light estimation. In addition, they are mostly trained with indoor synthetic foggy images, so it is unreasonable to expect such methods to work well on outdoor images. To address this problem, a series of generative adversarial network (GAN) based methods were proposed to directly recover clear images without using this model. Li et al. [24] modified the basic cGAN [26] to directly restore a clear image from a foggy image, using VGG features and an L1-regularised gradient prior in the loss function. In [9], an enhanced CycleGAN was proposed that combines cycle-consistency and VGG perceptual losses to directly generate a clean image; this method trains the network with unpaired samples.

Inspired by the ability of these methods to generate high-quality images with fine texture details, we propose a novel physics based generative adversarial network to remove fog from a single foggy image. Different from previous approaches, our network bridges image enhancement and image restoration to generate a clear image by learning two mapping functions via a cycle-enhance-restore generative adversarial framework. It consists of a Retinex-based contrast enhancement mapping network, an atmospheric scattering model (ASM) based texture restoration mapping network and two discriminative networks, as shown in Fig. 2. The proposed method nicely bridges the gap between the two approaches and leverages their respective strengths. We show that our model performs favourably against several state-of-the-art methods on both image quality metrics and visual inspection.

This paper makes the following contributions:

  • (1)

    A novel end-to-end Physics Based Generative Adversarial Network (PBGAN) for single image defogging is proposed. The new method embeds the Retinex model and the atmospheric scattering model within the cycle generative adversarial network framework. The Retinex model plays the role of image enhancement, while the atmospheric scattering model helps preserve the texture information of the input image. The cycle generative adversarial network brings the two models together and enables training on unpaired samples.

  • (2)

    A novel physics based texture recovery sub-network is proposed. It combines a transmission network (TranNet) and an atmospheric light network (AtmLightNet) with the atmospheric scattering model to directly generate a foggy image without using a depth map (see the sketch after this list). In this way, the network can preserve more texture information from the original inputs and achieve better performance.

  • (3)

    In order to enable Retinex theory and the atmospheric scattering model to work effectively together and complement each other, we propose a novel inversion-adversarial loss function and a novel inversion-cycle consistency loss function to constrain the generators. This enables our network to generate a clear image with higher contrast from a foggy image.
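The following PyTorch sketch illustrates the idea behind contribution (2): a transmission network and an atmospheric light network are composed with the ASM to re-fog a clear image, so the cycle can be closed without a depth map. The architectures of TranNet and AtmLightNet are not specified in this excerpt, so the stand-in networks, output shapes and sigmoid squashing below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ASMComposer(nn.Module):
    """Re-fog a clear image J with a learned transmission map t and
    atmospheric light A via the ASM: I = J * t + A * (1 - t).
    `tran_net` and `atm_net` stand in for the paper's TranNet and
    AtmLightNet, whose architectures are not given in this excerpt."""
    def __init__(self, tran_net: nn.Module, atm_net: nn.Module):
        super().__init__()
        self.tran_net = tran_net
        self.atm_net = atm_net

    def forward(self, clear: torch.Tensor) -> torch.Tensor:
        # Assumed output shapes: t is (B, 1, H, W), A is (B, 3, 1, 1);
        # sigmoid keeps both in (0, 1).
        t = torch.sigmoid(self.tran_net(clear))
        A = torch.sigmoid(self.atm_net(clear))
        return clear * t + A * (1.0 - t)  # ASM composition, no depth map needed
```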

Section snippets

Retinex model

Retinex is a color vision model introduced by Edwin H. Land [21]; the name is a portmanteau of retina and cortex. This model assumes that an image can be decomposed into two components, reflection and illumination:

I(x) = R(x)L(x)

where I denotes the observed image, and R and L represent the reflection and illumination respectively. For a foggy image I, R is our desired recovered image. In practice, in order to obtain the reflection component, we first calculate the illumination by applying a Gaussian
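Concretely, a minimal single-scale Retinex implementation estimates the illumination with a Gaussian surround and recovers the log-reflectance by subtraction; the sigma value and the per-channel stretch below are illustrative choices, not the paper's settings.

```python
import numpy as np
import cv2

def single_scale_retinex(img, sigma=80.0, eps=1e-6):
    """Single-scale Retinex sketch: estimate illumination L with a Gaussian
    surround, then recover log R = log I - log L. `img` is assumed to be a
    float RGB array in [0, 1] of shape (H, W, 3)."""
    img = img.astype(np.float64) + eps
    L = cv2.GaussianBlur(img, (0, 0), sigma)   # illumination estimate
    log_R = np.log(img) - np.log(L + eps)      # reflectance in the log domain
    # Stretch each channel back to [0, 1] for display.
    lo, hi = log_R.min(axis=(0, 1)), log_R.max(axis=(0, 1))
    return (log_R - lo) / (hi - lo + eps)
```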

PBGAN defogging method

This section presents the details of our physics based generative adversarial network, which employs the Retinex and atmospheric scattering degradation models. We refer to this network as PBGAN, as shown in Fig. 2. It consists of two modules, a Retinex based enhancement module and an ASM based restoration module. In the Retinex based enhancement module, we combine an end-to-end network (RetNet) with Eq. (2) to enhance the image brightness. In the ASM based restoration module, we make explicit use of two
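As a rough sketch of the enhancement path, a stand-in RetNet can predict the illumination L so that the reflectance R = I / L follows from the Retinex decomposition above; since the module's details (including Eq. (2)) are not reproduced here, the sigmoid, clamping and output convention are assumptions.

```python
import torch
import torch.nn as nn

class RetinexEnhancer(nn.Module):
    """Sketch of a Retinex-based enhancement module: `ret_net` (standing in
    for the paper's RetNet, architecture unspecified) predicts illumination L,
    and the enhanced output is the reflectance R = I / L from I = R * L."""
    def __init__(self, ret_net: nn.Module, eps: float = 1e-3):
        super().__init__()
        self.ret_net = ret_net
        self.eps = eps

    def forward(self, foggy: torch.Tensor) -> torch.Tensor:
        # Assumed: ret_net outputs an illumination map broadcastable to the
        # input, squashed to (eps, 1) so the division brightens the image.
        L = torch.sigmoid(self.ret_net(foggy)).clamp(min=self.eps)
        return (foggy / L).clamp(0.0, 1.0)  # reflectance R as the enhanced image
```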

Experimental results

In this section, we qualitatively and quantitatively compare the defogged results of our proposed approach on synthetic and real-world images against three other state-of-the-art approaches: DCP [15], FD [2] and Dehazenet [5]. These three methods were chosen because DCP won the best paper award at CVPR 2009; the method is not only simple and effective, but also led to renewed prosperity in the field of defogging in the following years. Moreover, to the best of our knowledge, FD is
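For the quantitative comparison on synthetic images, PSNR and SSIM are the usual full-reference metrics for defogging benchmarks; the snippet below shows one way to score a test pair. The excerpt does not list the paper's exact protocol, so this is a conventional setup (and `channel_axis` requires a recent scikit-image).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(defogged: np.ndarray, ground_truth: np.ndarray):
    """Score one synthetic test pair with PSNR and SSIM. Both images are
    assumed to be float RGB arrays in [0, 1] of shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(ground_truth, defogged, data_range=1.0)
    ssim = structural_similarity(ground_truth, defogged,
                                 data_range=1.0, channel_axis=-1)
    return psnr, ssim
```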

Concluding remarks

This paper has presented a new defogging method based on a cycle generative adversarial network framework. The proposed method effectively combines the image enhancement method and the image restoration method to remove the fog from a single foggy image, compensating for the shortcomings of the two methods in defogging applications, such as over-enhancement, color distortion, and low contrast. In the training modules, we exploit an inversion operator based Retinex model to strengthen the

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


    This paper has been recommended for acceptance by Sinisa Todorovic.
