1 Introduction

Images acquired in outdoor scenarios often suffer from the effects of atmospheric phenomena such as fog or haze. The main characteristic of these phenomena is the scattering of light. This scattering distorts contrast and colour in the image, decreasing the visibility of the scene content and reducing visual quality.

Koschmieder [15] defined a model of how these atmospheric phenomena affect the acquired image. The model depends on two parameters: a depth-dependent transmission (\(\varvec{t}\)) and the colour of the airlight (\(\varvec{A}\)). Mathematically, the model is written as

$$\begin{aligned} \varvec{I_{x,\cdot }}= t_x \cdot \varvec{J_{x,\cdot }} +(1-t_x) \cdot \varvec{A}. \end{aligned}$$
(1)

Here x is a particular image pixel, \(\varvec{J_{x,\cdot }}\) is the 1-by-3 vector of the R,G,B values at pixel x of the clear image (i.e., how the image would look without atmospheric scattering), and \(\varvec{I_{x,\cdot }}\) is the 1-by-3 vector of the R,G,B values at pixel x of the image affected by scattering. We remark that the transmission \(\varvec{t}\) depends only on the scene depth, and is therefore assumed to be equal for the three colour channels.
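As an illustration of Eq. 1, the following NumPy sketch (our own toy example, not part of any cited method) synthesises a hazy image from a clear image, a transmission map, and an airlight colour:

```python
import numpy as np

def apply_haze(J, t, A):
    """Synthesise a hazy image following Eq. 1: I = t * J + (1 - t) * A.
    J: H-by-W-by-3 clear image with values in [0, 1].
    t: H-by-W per-pixel transmission with values in (0, 1].
    A: length-3 airlight colour."""
    t = t[..., np.newaxis]               # broadcast t over the three channels
    return t * J + (1.0 - t) * np.asarray(A)

# Toy example: a mid-grey image, uniform transmission 0.6, white airlight.
J = np.full((4, 4, 3), 0.5)
t = np.full((4, 4), 0.6)
I = apply_haze(J, t, [1.0, 1.0, 1.0])
# Each pixel becomes 0.6 * 0.5 + 0.4 * 1.0 = 0.7, i.e. washed out towards white.
```

Note how a single scalar transmission per pixel moves all three channels towards the airlight by the same amount, which is the physical constraint exploited later in this paper.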

Fig. 1.

Recurring problems of non-physical based dehazing methods. From left to right: original image, the solutions of the methods of Galdran et al. [11] (top), and Choi et al. [6] (bottom), and the result of the post-processing introduced in this paper. We can clearly see the artefacts in the result of Galdran et al. and the over-saturation in the result of Choi et al. Both problems are solved by the post-processing approach proposed in this paper.

Image dehazing methods (i.e. methods that, given a hazy image \(\varvec{I}\), obtain a clear image \(\varvec{J}\)) are becoming crucial for computer vision, since several methods (for recognition and classification, among other tasks) are expected to work in the wild. Some examples are those used for surveillance through CCTV cameras, tracking, or the self-driving of vehicles and drones. However, the vast majority of these methods are devised for clear images and tend to fail under adverse weather conditions. Image dehazing methods can be roughly divided into two categories: (i) physics-based methods that estimate the transmission of the image and solve for the clear image by inverting Eq. 1 [3, 4, 7, 13, 18, 21, 23, 24, 27], and (ii) image processing methods that directly process the hazy image so as to obtain a dehazed image without considering the previous equation (from now on, we will call these non-physical dehazing methods) [2, 6, 10, 11, 12, 25, 26].

In this paper we focus on non-physical dehazing methods. These methods are able to obtain state-of-the-art results, but may sometimes present over-saturated colours and colour artefacts, mostly because a different transmission is obtained for each colour channel. An example of the problems just mentioned is shown in Fig. 1, where, from left to right, we show two original images, the results of the methods of Galdran et al. [11] (top) and Choi et al. [6] (bottom), and the results obtained using the approach of this paper.

Very few methods specifically aim at reducing the colour artefacts that appear in dehazed images. Matlin and Milanfar [17] proposed an iterative regression method to simultaneously perform denoising and dehazing. Li et al. [16] decomposed the image into high and low frequencies, performing the dehazing only on the low frequencies, thus avoiding blocking artefacts. Chen et al. [5] applied both a smoothing filter for the refinement of the transmission and an energy minimisation in the recovery phase to avoid the appearance of gradients in the output image that were not present in the original image.

In this paper we present a post-processing model for non-physical dehazing methods that aims at providing an output image that satisfies the physical constraint given by Eq. 1. Our method is based on a channel-coupling approach and is devised to obtain a single transmission for all the colour channels. Furthermore, our method also improves the estimation of the airlight colour.

2 Imposing a Physically Plausible Dehazing

In this section, we define our approach for the post-processing of non-physical dehazing methods. Our main goal is, given an original hazy image and the solution of a non-physical dehazing method, to obtain a single transmission and an airlight that minimise the error of Eq. 1. We can write this minimisation in matrix form as:

$$\begin{aligned} \{\varvec{A^{our}},\varvec{t^{our}}\}=arg min_{\varvec{A^*},\varvec{t^*}} \Vert (\varvec{1}-\varvec{t^*})\cdot \varvec{A^*}-\varvec{I}+\varvec{T^*} \odot \varvec{J} \Vert . \end{aligned}$$
(2)

where \(\varvec{1}\) is an N-by-1 vector with a value of 1 in every entry, \(\varvec{t^*}\) is an N-by-1 vector that represents the transmission, \(\varvec{A^*}\) is a 1-by-3 vector that provides the airlight, \(\varvec{I}\) and \(\varvec{J}\) are N-by-3 matrices representing the input image and the non-physical dehazing solution respectively, N is the number of pixels, \(\varvec{T^*}\) is an N-by-3 matrix consisting of three replications of \(\varvec{t^*}\), and \(\odot \) represents element-wise multiplication.

It is clear that to solve this equation we need to select an initial guess for either \(\varvec{A^{our}}\) or \(\varvec{t^{our}}\). This is not a problem, since a standard hypothesis used in many image dehazing works is to select \(\varvec{A^{our}} =[1,1,1]\). Equation 2 also shows that we should perform the minimisation iteratively along two different dimensions: when we look for \(\varvec{t^{our}}\) we minimise, for each pixel x of the image, over the three colour channels, while when we look for \(\varvec{A^{our}}\) we minimise, for each colour channel c, over all the pixels.

We now detail our iterative minimisation. Let us start with \(\varvec{I}\), \(\varvec{J}\), and the initial guess for \(\varvec{A^{our}}\). In this case we can solve for the value of \(\varvec{t^{our}}\) at each pixel x using a least squares minimisation:

$$\begin{aligned} \forall (x) t^{our}_x=arg min_{t^{*}_x} \Vert (\varvec{I_{x,\cdot }}-\varvec{A^{our}})-t^{*}_x \cdot (\varvec{J_{x,\cdot }}-\varvec{A^{our}}) \Vert _{2}. \end{aligned}$$
(3)

As stated in the introduction, \(\varvec{J_{x,\cdot }}\) and \(\varvec{I_{x,\cdot }}\) are the 1-by-3 colour vectors at pixel x. This least squares minimisation has the following solution

$$\begin{aligned} \forall (x) t^{our}_x=(\varvec{I_{x,\cdot }}-\varvec{A^{our}})(\varvec{J_{x,\cdot }}-\varvec{A^{our}})^T((\varvec{J_{x,\cdot }}-\varvec{A^{our}})(\varvec{J_{x,\cdot }}-\varvec{A^{our}})^T)^{-1}. \end{aligned}$$
(4)

where \(^T\) denotes the transpose of the vector.
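Since \((\varvec{J_{x,\cdot }}-\varvec{A^{our}})(\varvec{J_{x,\cdot }}-\varvec{A^{our}})^T\) is a scalar, the inverse in Eq. 4 reduces to a division, and the update can be vectorised over all pixels. A minimal NumPy sketch of this step (the `eps` guard and the clipping of the transmission to [0, 1] are our own safeguards, not stated in the derivation above):

```python
import numpy as np

def estimate_transmission(I, J, A, eps=1e-8):
    """Least-squares per-pixel transmission (Eq. 4).
    I, J: N-by-3 matrices (hazy image and non-physical dehazed result).
    A: length-3 airlight colour."""
    dI = I - A                              # I_{x,.} - A, per pixel
    dJ = J - A                              # J_{x,.} - A, per pixel
    num = np.sum(dI * dJ, axis=1)           # (I_x - A)(J_x - A)^T, scalar per pixel
    den = np.sum(dJ * dJ, axis=1) + eps     # (J_x - A)(J_x - A)^T
    return np.clip(num / den, 0.0, 1.0)     # a transmission should lie in [0, 1]
```

If \(\varvec{I}\) was generated from \(\varvec{J}\) by Eq. 1 with this airlight, the estimate recovers the true transmission exactly (up to the `eps` guard).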

Once we have found the transmission value \(\varvec{t^{our}}\), we can refine the value of \(\varvec{A^{our}}\) via a least squares approach. In this case, as stated above, we perform the least squares minimisation over the pixels of the image for each of the three colour channels. Mathematically,

$$\begin{aligned} \forall (c) A^{our}_c=arg min_{A^*_c} \Vert (\varvec{1}-\varvec{t^{our}})\cdot A^*_c-\varvec{I_{\cdot ,c}}+\varvec{t^{our}} \odot \varvec{J_{\cdot ,c}} \Vert _2. \end{aligned}$$
(5)

In this case, \(\varvec{J_{\cdot ,c}}\) and \(\varvec{I_{\cdot ,c}}\) are N-by-1 vectors representing each colour channel of the images (i.e. \(c=\{R,G,B\}\)), N is the number of pixels, and \(\varvec{1}\) is again an N-by-1 vector with 1 at every entry.

This minimisation leads to

$$\begin{aligned} \forall (c) A_c^{our}=((\varvec{1}-\varvec{t^{our}})^T(\varvec{1}-\varvec{t^{our}}))^{-1}((\varvec{1}-\varvec{t^{our}})^T(\varvec{I_{\cdot ,c}}-\varvec{t^{our}} \odot \varvec{J_{\cdot ,c}} )) \end{aligned}$$
(6)

where \(^T\) denotes the transpose of the vector.
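Here, too, \((\varvec{1}-\varvec{t^{our}})^T(\varvec{1}-\varvec{t^{our}})\) is a scalar, so the per-channel update is a single dot product and a division. A minimal NumPy sketch of this step (the `eps` guard is our own safeguard against a fully transmissive image):

```python
import numpy as np

def estimate_airlight(I, J, t, eps=1e-8):
    """Least-squares airlight update (Eq. 6).
    I, J: N-by-3 matrices, t: length-N transmission vector."""
    w = 1.0 - t                             # (1 - t), length N
    den = np.dot(w, w) + eps                # scalar (1 - t)^T (1 - t)
    num = w @ (I - t[:, None] * J)          # (1 - t)^T (I_c - t * J_c), per channel
    return num / den                        # length-3 airlight estimate
```

Again, if \(\varvec{I}\) was generated by Eq. 1 with a known transmission, this update recovers the true airlight.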

Once this new \(\varvec{A^{our}}\) is obtained, we can keep the iterative approach going by further refining the previous \(\varvec{t^{our}}\) following again Eq. 3.

Finally, once the desired number of iterations has been performed, and given \(\varvec{t^{our}}\), \(\varvec{A^{our}}\), and the original hazy image \(\varvec{I}\), we can obtain our output image \(\varvec{J^{our}}\) by solving Eq. 1:

$$\begin{aligned} \varvec{J^{our}_{x,\cdot }}=\frac{\varvec{I_{x,\cdot }}-\varvec{A^{our}}}{t^{our}_x}+\varvec{A^{our}}. \end{aligned}$$
(7)

We want to draw the reader's attention to the relation of this approach to the Alternating Least Squares (ALS) method introduced by Finlayson et al. [8]. As in the ALS method, we follow an iterative procedure for the minimisation of a norm-based function.
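The full alternating scheme can be sketched as follows (a NumPy illustration of Eqs. 4, 6 and 7 under the standard initial guess \(\varvec{A^{our}}=[1,1,1]\); the `eps` guards and the final clipping to [0, 1] are our own safeguards, not part of the formulation above):

```python
import numpy as np

def physical_postprocess(I, J, n_iter=5, eps=1e-8):
    """Impose a single transmission and airlight on a non-physical dehazing
    result. I, J: N-by-3 matrices (hazy image and dehazed result) in [0, 1].
    Returns the physically constrained output image, t and A."""
    A = np.ones(3)                          # standard initial guess A = [1, 1, 1]
    for _ in range(n_iter):
        dI, dJ = I - A, J - A
        # Eq. 4: per-pixel least-squares transmission
        t = np.sum(dI * dJ, axis=1) / (np.sum(dJ * dJ, axis=1) + eps)
        t = np.clip(t, eps, 1.0)
        # Eq. 6: per-channel least-squares airlight
        w = 1.0 - t
        A = (w @ (I - t[:, None] * J)) / (np.dot(w, w) + eps)
    # Eq. 7: recover the output image
    Jout = (I - A) / t[:, None] + A
    return np.clip(Jout, 0.0, 1.0), t, A
```

On an image pair that exactly satisfies Eq. 1, the scheme recovers the clear image; on real dehazing results, it projects the solution onto the closest physically consistent one in the least-squares sense.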

3 Experiments and Results

This section is divided into three parts. First, we show qualitative results for our approach when applied to different non-physical dehazing methods. This is followed by a quantitative analysis of our post-processing. The section ends with a subjective evaluation using a preference test. In all our results we have allowed our approach to perform 5 iterations, as we have found experimentally that this is enough to obtain stable results. We have initialised the iterative approach by setting \(\varvec{A^{our}}=[1,1,1]\).

3.1 Qualitative Evaluation

In all the following figures, we show on the left the original hazy image, in the centre the result of the selected dehazing method, and on the right the result obtained by our method.

Fig. 2.

Results of our post-processing for the EVID method. From left to right: Original hazy image, result of the EVID method, result after our post-processing.

Fig. 3.

Results of our post-processing for the FVID method. From left to right: Original hazy image, result of the FVID method, result after our post-processing.

Fig. 4.

Results of our post-processing for the DEFADE method. From left to right: Original hazy image, result of the DEFADE method, result after our post-processing. (Color figure online)

Fig. 5.

Results of our post-processing for the Wang et al. method. From left to right: Original hazy image, result of the Wang et al. method, result after our post-processing. (Color figure online)

Figure 2 shows the results for the EVID method [11]. We can see that the original method induces an odd increase of contrast in the nearby objects of the image, making these objects look unnatural (e.g. the nearby plants in the top image, and the gravestones in the bottom image). These problems are clearly alleviated in our results.

Figure 3 shows the results for the FVID method [12]. The biggest problem of this image dehazing method is the appearance of artefacts (at the base of the bushes in the top image and in the sky in the bottom image). The top image also clearly presents an excessive, unnatural contrast. All these problems are suppressed by our proposed approach.

Figure 4 shows the results for the DEFADE method [6]. This method over-enhances the colours, as can be clearly seen in the green of the plants in the top image and in the orange hue of the boy’s jacket in the bottom one. Once again, these problems are solved after applying our proposed post-processing.

Finally, Fig. 5 shows the results for the method of Wang et al. [26]. In this particular case, the images present an unreasonable contrast, which provokes the appearance of unrealistic edges and colours (see the green of the grass and the closer bushes in the top image, and the wall of the nearby building in the bottom image). Once again, these problems are mitigated once our method is applied.

Fig. 6.

Original images used in both the quantitative evaluation and the preference test.

3.2 Quantitative Evaluation

For this subsection, we have selected six standard hazy images that appear in most works dealing with image dehazing. They are shown in Fig. 6. Regarding the non-physical dehazing methods to be evaluated, we have selected the following five: FVID [12], DEFADE [6], the method of Wang et al. [26], and the DehRet method of [10] with two different Retinex back-ends, namely the variational SRIE approach [9] and the Multiscale Retinex (MSCR) method [22].

We have computed two different image quality metrics to evaluate our results: the Naturalness Image Quality Evaluator (NIQE) [20] and the BRISQUE metric [19]. We have selected these metrics because we do not have access to the corresponding ground-truth (fog-free) images. Let us note that when ground-truth images are available, further metrics can also be considered [14].

NIQE is an error metric that states how natural an image is (the smaller the value, the higher the naturalness). Table 1 presents the mean and RMS results for this metric. Our method improves the results in all cases except for the FVID method. In this last case, the means for the original dehazing method and for our approach are the same, and the RMS for our approach is slightly worse than that for the original dehazing.

BRISQUE is a distortion-based metric that also tries to predict whether an image looks natural based on scene statistics (the smaller the value, the better the result). Table 2 presents the results for this metric. In this case, our method outperforms the original methods in all cases.

Table 1. Mean and RMS results for the NIQE measure.
Table 2. Mean and RMS results for the BRISQUE measure.

3.3 Preference Test

We have also performed a preference test with the same set of images used in the previous subsection. In total, 7 observers completed the experiment. All observers were tested for normal colour vision. The experiment was conducted on a NEC SpectraView Reference 271 monitor set to ‘sRGB’ mode. The display was viewed at a distance of approximately 70 cm, so that 40 pixels subtended \(1^{\circ }\) of visual angle. Stimuli were generated in MATLAB (MathWorks) with functions from the Psychtoolbox. The experiment was conducted in a dark room.

Subjects were presented with three images: in the centre, the original hazy image, and at each side, the result of the original dehazing method and the result of our post-processing approach. Let us note that the side for these two images was selected randomly, and therefore varied at each presentation. Subjects were asked to select the preferred dehazed image. The total number of comparisons was 30.

Results have been obtained following the Thurstone Case V Law of Comparative Judgement. Figure 7 shows the results for the whole set of comparisons (i.e., considering the 5 original dehazing methods together). We can clearly see that our method statistically outperforms the original dehazing methods.
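For readers unfamiliar with Thurstone Case V scaling, the following toy NumPy sketch (our own illustration; the exact procedure used in the experiment may differ, e.g. in how ties and extreme proportions are handled) shows how pairwise preference counts are converted into scale values:

```python
import numpy as np
from statistics import NormalDist

def thurstone_case_v(wins):
    """Toy Thurstone Case V scaling. wins[i, j] = number of times method i
    was preferred over method j. Preference proportions are converted to
    z-scores via the inverse normal CDF; each method's scale value is its
    mean z-score (higher = more preferred)."""
    n = wins + wins.T                        # total comparisons per pair
    p = np.where(n > 0, wins / np.maximum(n, 1), 0.5)
    p = np.clip(p, 0.01, 0.99)               # avoid infinite z-scores at 0 or 1
    z = np.vectorize(NormalDist().inv_cdf)(p)
    np.fill_diagonal(z, 0.0)                 # a method is not compared to itself
    return z.mean(axis=1)
```

For example, a method preferred in 9 of 10 comparisons against another obtains a higher scale value than its competitor.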

Fig. 7.

Result of the preference test.

A more detailed analysis that looks individually at each dehazing method is presented in Fig. 8. We can clearly see that our method greatly outperforms the results of the DEFADE, Wang et al., and DehRet-MSCR methods. In the case of the FVID and DehRet-SRIE methods, our method is statistically equivalent to the original method. Let us note that these results are well aligned with those obtained in the previous subsection, as the two methods that are statistically equivalent to our post-processing were also the two methods for which our improvement in the metrics was smallest.

The results shown lead us to conclude that our method is very reliable, both quantitatively and subjectively: it does not output a result that is worse than the original dehazing result. Also, let us note that we cannot hypothesise which original method is the best, as no direct subjective comparison among them was performed. However, we can hypothesise that FVID and DehRet-SRIE are the closest to following the physical model, as our method does not present a significant improvement over them.

Fig. 8.

Results of the preference test split per method.

4 Conclusions

We have presented an approach to impose a physical behaviour on non-physical dehazing methods. Our approach is based on an iterative coupling of the colour channels, inspired by the Alternating Least Squares (ALS) method. Results show that our approach is very promising. As further work, we will perform larger experiments with more images and subjects, consider other evaluation paradigms (e.g. SIFT-based comparison [1]), and study the convergence of our iterative scheme.