A physics based generative adversarial network for single image defogging

https://doi.org/10.1016/j.imavis.2019.10.001

Abstract

In the field of single image defogging, there are two main methods. One is the image restoration method based on the atmospheric scattering theory, which can recover image texture details well. The other is the image enhancement method based on Retinex theory, which can improve image contrast well. In practice, however, the former can easily lead to low-contrast images, while the latter is prone to losing texture details. Therefore, how to effectively combine the advantages of both to remove fog is a key issue in the field. In this paper, we have developed a physics based generative adversarial network (PBGAN) to exploit the advantages of these two methods in parallel. To our knowledge, it is the first learning-based defogging framework that incorporates these two methods and enables them to work together and complement each other. Our method has two generative adversarial modules, the Contrast Enhancement (CE) module and the Texture Restoration (TR) module. To improve contrast in the CE module, we introduce a novel inversion-adversarial loss and a novel inversion-cycle consistency loss for training the generator. To improve texture in the TR module, we introduce two convolutional neural networks to learn the atmospheric light coefficient and the transmission map, respectively. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed approach performs better than several state-of-the-art methods quantitatively and qualitatively.

Introduction

The imaging process of foggy images is mainly affected by small particles in the atmosphere scattering and absorbing light. This degrades the visibility of the captured images, which affects not only applications such as video surveillance but also the performance of many computer vision tasks such as classification and segmentation. Defogging is, therefore, an important image processing task which has in recent years attracted a lot of attention in the computer vision community. In the published literature, there are two major categories of defogging methods: those based on image enhancement and those based on image restoration.

The advantage of image enhancement based methods is that they are simple and easy to use, such as histogram equalisation [34], wavelet analysis [8], and Retinex theory [19, 21, 29]. Arguably, Retinex theory based methods are amongst the best. However, these methods only improve the image contrast and sharpness and do not really remove the fog from the image. Moreover, artifacts such as color distortion and halos can appear in their defogged results. Thus, these kinds of methods are usually used as post-processing on the image after defogging.

Image restoration methods are usually based on a physics model. In particular, prior-based methods have drawn significant attention in this field over the last decade. These methods are also known as hand-crafted methods and are usually based on the atmospheric scattering model (ASM) [25]. With accurate transmission and scattering light estimation, they can remove fog from images and preserve the edge information. Fattal [10] discovered that surface shading is locally uncorrelated with the transmission, and used it to recover the scene albedo. Tan et al. [36] recovered the visibility of a foggy image by using a patch-based contrast-maximization approach. He et al. [15] presented a dark channel prior (DCP) for single image defogging, based on the statistical observation that the value of at least one channel of a fog-free image is close to zero. More recently, Zhu et al. [43] found that the scene depth is positively correlated with the difference between the brightness and saturation of the scene, and used this prior to recover the scene depth. Fattal [11] proposed a color-lines method based on the statistics of natural images, namely that the pixels of small image patches typically exhibit a one-dimensional distribution in the RGB color space. However, hand-crafted methods have their limitations due to their strict constraint conditions. For instance, according to the statistical prior of He et al. [15], objects with high intensity values such as the sky and white buildings do not have a dark channel, so the method fails when dealing with them. In addition, defogged results obtained through the physics model typically look dim and have low contrast.
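To make the DCP pipeline concrete, the following sketch estimates the dark channel, atmospheric light A, and transmission t, and then inverts the ASM, I(x) = J(x)t(x) + A(1 - t(x)), to recover the scene radiance J. This is a minimal illustration of the classic recipe; the patch size and the omega and t0 values are conventional choices, not code released with the cited papers.

```python
import numpy as np
import cv2

def dark_channel(img, patch=15):
    # Per-pixel minimum over the three color channels ...
    min_rgb = img.min(axis=2)
    # ... followed by a minimum filter over a local patch (erosion).
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    return cv2.erode(min_rgb, kernel)

def dcp_defog(img, omega=0.95, t0=0.1, patch=15):
    """Classic single-image defogging with the dark channel prior [15].
    `img` is assumed to be a float RGB array in [0, 1] of shape (H, W, 3)."""
    img = img.astype(np.float32)
    dark = dark_channel(img, patch)
    # Estimate atmospheric light A from the brightest 0.1% of dark-channel pixels.
    n = max(1, int(dark.size * 0.001))
    idx = np.unravel_index(np.argsort(dark, axis=None)[-n:], dark.shape)
    A = img[idx].max(axis=0)
    # Transmission estimate: t(x) = 1 - omega * dark_channel(I / A).
    t = 1.0 - omega * dark_channel(img / A, patch)
    # Invert the ASM: J = (I - A) / max(t, t0) + A.
    t = np.clip(t, t0, 1.0)[..., None]
    return np.clip((img - A) / t + A, 0.0, 1.0)
```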

In summary, both categories of methods have their own advantages and disadvantages. The question is whether there is a way to combine the advantages of both for single image defogging. Recently, deep learning-based methods [5, 30, 38] have shown powerful defogging ability and are able to overcome the disadvantages of hand-crafted priors. Cai et al. [5] proposed an end-to-end convolutional neural network with a novel BReLU unit that learns fog features from data to estimate the transmission. Ren et al. [30] combined a coarse-scale network and a fine-scale network, called the Multi-Scale convolutional neural network (MSCNN), to optimize the transmission. Zhang et al. [38] proposed a densely connected encoder-decoder structure with a multi-level pyramid pooling module for estimating the transmission map. AOD-Net [22] was proposed to generate a clean image through a light-weight CNN. Ren et al. [31] proposed a novel fusion-based network that learns the features of three inputs derived from a foggy image by applying white balance, contrast enhancement and gamma correction; the final defogged result is obtained by gating these features. However, these methods are still based on the atmospheric scattering model, so they inevitably produce artifacts in the defogged results due to inaccurate transmission and atmospheric light estimation. In addition, they are mostly trained with indoor synthetic foggy images, so it is unreasonable to expect such methods to work well on outdoor images. To address this problem, a series of generative adversarial network (GAN) based methods were proposed to directly recover clear images without using this model. Li et al. [24] modified the basic cGAN [26] to directly restore a clear image from a foggy image, using VGG features and an L1-regularised gradient prior in the loss function. In [9], an enhanced CycleGAN was proposed that combines cycle-consistency and VGG perceptual losses to directly generate a clean image; this method trains the network with unpaired samples.

Inspired by the ability of these methods to generate high-quality images with fine texture details, we propose a novel physics based generative adversarial network to remove fog from a single foggy image. Different from previous approaches, our network bridges image enhancement and image restoration to generate a clear image by learning two mapping functions via a cycle-enhance-restore generative adversarial framework. It consists of a Retinex-based contrast enhancement mapping network, an atmospheric scattering model (ASM) based texture restoration mapping network and two discriminative networks, as shown in Fig. 2. The proposed method nicely bridges the gap between the two approaches and leverages their respective strengths. We show that our model performs favourably against several state-of-the-art methods on both image quality metrics and visual inspection.

This paper makes the following contributions:

  • (1)

    A novel end-to-end Physics Based Generative Adversarial Network (PBGAN) for single image defogging is proposed. The new method embeds the Retinex model and the atmospheric scattering model within the cycle generative adversarial network framework. The Retinex model plays the role of image enhancement, while the atmospheric scattering model helps preserve the texture information of the input image. The cycle generative adversarial network brings the two models together and enables training on unpaired samples.

  • (2)

    A novel physics based texture recovery sub-network is proposed. It combines a transmission network (TranNet) and an atmospheric light network (AtmLightNet) with the atmospheric scattering model to directly generate a foggy image without using a depth map (see the sketch after this list). In this way, the network can preserve more texture information from the original inputs and achieve better performance.

  • (3)

    In order to enable Retinex theory and the atmospheric scattering model to work effectively together and complement each other, we propose a novel inversion-adversarial loss function and a novel inversion-cycle consistency loss function to constrain the generators. This enables our network to generate a clear image with higher contrast from a foggy image.
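The following PyTorch sketch illustrates the idea behind contribution (2): a transmission network and an atmospheric light network are composed with the ASM to re-fog a clear image, so the cycle can be closed without a depth map. The architectures of TranNet and AtmLightNet are not specified in this excerpt, so the stand-in networks, output shapes and sigmoid squashing below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ASMComposer(nn.Module):
    """Re-fog a clear image J with a learned transmission map t and
    atmospheric light A via the ASM: I = J * t + A * (1 - t).
    `tran_net` and `atm_net` stand in for the paper's TranNet and
    AtmLightNet, whose architectures are not given in this excerpt."""
    def __init__(self, tran_net: nn.Module, atm_net: nn.Module):
        super().__init__()
        self.tran_net = tran_net
        self.atm_net = atm_net

    def forward(self, clear: torch.Tensor) -> torch.Tensor:
        # Assumed output shapes: t is (B, 1, H, W), A is (B, 3, 1, 1);
        # sigmoid keeps both in (0, 1).
        t = torch.sigmoid(self.tran_net(clear))
        A = torch.sigmoid(self.atm_net(clear))
        return clear * t + A * (1.0 - t)  # ASM composition, no depth map needed
```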

Section snippets

Retinex model

Retinex is a color vision model introduced by Edwin H. Land [21]; the name is a portmanteau of retina and cortex. This model assumes that an image can be decomposed into two components, reflection and illumination:

I(x) = R(x)L(x)

where I denotes the observed image, and R and L represent the reflection and illumination respectively. For a foggy image I, R is our desired recovered image. In practice, in order to obtain the reflection component, we first calculate the illumination by applying a Gaussian
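Concretely, a minimal single-scale Retinex implementation estimates the illumination with a Gaussian surround and recovers the log-reflectance by subtraction; the sigma value and the per-channel stretch below are illustrative choices, not the paper's settings.

```python
import numpy as np
import cv2

def single_scale_retinex(img, sigma=80.0, eps=1e-6):
    """Single-scale Retinex sketch: estimate illumination L with a Gaussian
    surround, then recover log R = log I - log L. `img` is assumed to be a
    float RGB array in [0, 1] of shape (H, W, 3)."""
    img = img.astype(np.float64) + eps
    L = cv2.GaussianBlur(img, (0, 0), sigma)   # illumination estimate
    log_R = np.log(img) - np.log(L + eps)      # reflectance in the log domain
    # Stretch each channel back to [0, 1] for display.
    lo, hi = log_R.min(axis=(0, 1)), log_R.max(axis=(0, 1))
    return (log_R - lo) / (hi - lo + eps)
```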

PBGAN defogging method

This section presents the details of our physics based generative adversarial network, which employs the Retinex and atmospheric scattering degradation models. We refer to this network as PBGAN, as shown in Fig. 2. It consists of two modules, a Retinex based enhancement module and an ASM based restoration module. In the Retinex based enhancement module, we combine an end-to-end network (RetNet) with Eq. (2) to enhance the image brightness. In the ASM based restoration module, we make explicit use of two
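As a rough sketch of the enhancement path, a stand-in RetNet can predict the illumination L so that the reflectance R = I / L follows from the Retinex decomposition above; since the module's details (including Eq. (2)) are not reproduced here, the sigmoid, clamping and output convention are assumptions.

```python
import torch
import torch.nn as nn

class RetinexEnhancer(nn.Module):
    """Sketch of a Retinex-based enhancement module: `ret_net` (standing in
    for the paper's RetNet, architecture unspecified) predicts illumination L,
    and the enhanced output is the reflectance R = I / L from I = R * L."""
    def __init__(self, ret_net: nn.Module, eps: float = 1e-3):
        super().__init__()
        self.ret_net = ret_net
        self.eps = eps

    def forward(self, foggy: torch.Tensor) -> torch.Tensor:
        # Assumed: ret_net outputs an illumination map broadcastable to the
        # input, squashed to (eps, 1) so the division brightens the image.
        L = torch.sigmoid(self.ret_net(foggy)).clamp(min=self.eps)
        return (foggy / L).clamp(0.0, 1.0)  # reflectance R as the enhanced image
```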

Experimental results

In this section, we qualitatively and quantitatively compare the defogged results of our proposed approach on synthetic and real-world images against three other state-of-the-art approaches: DCP [15], FD [2] and Dehazenet [5]. These three methods were chosen because DCP won the best paper award at CVPR 2009; the method is not only simple and effective, but also led to renewed prosperity in the field of defogging in the following years. Moreover, to the best of our knowledge, FD is
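For the quantitative comparison on synthetic images, PSNR and SSIM are the usual full-reference metrics for defogging benchmarks; the snippet below shows one way to score a test pair. The excerpt does not list the paper's exact protocol, so this is a conventional setup (and `channel_axis` requires a recent scikit-image).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(defogged: np.ndarray, ground_truth: np.ndarray):
    """Score one synthetic test pair with PSNR and SSIM. Both images are
    assumed to be float RGB arrays in [0, 1] of shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(ground_truth, defogged, data_range=1.0)
    ssim = structural_similarity(ground_truth, defogged,
                                 data_range=1.0, channel_axis=-1)
    return psnr, ssim
```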

Concluding remarks

This paper has presented a new defogging method based on a cycle generative adversarial network framework. The proposed method effectively combines the image enhancement method and the image restoration method to remove the fog from a single foggy image, compensating for the shortcomings of the two methods in defogging applications, such as over-enhancement, color distortion, and low contrast. In the training modules, we exploit an inversion operator based Retinex model to strengthen the

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


    This paper has been recommended for acceptance by Sinisa Todorovic.
