1 Introduction

It is challenging to detect infrared (IR) small targets for several reasons. First, an IR small target occupies only a few pixels in an image because the detection distance is long [3, 13]. Second, the target exhibits a point spread characteristic due to reflection, refraction, and sensor aperture diffraction [23, 24]. Third, the intensity and shape of a small IR target can change with season, weather, and time of day [8, 12]. In addition, sunlight reflected by the ocean and cirrus clouds can also interfere with detection. Finally, broken clouds and cloud edges are among the main causes of false alarms in IR small target detection [9, 21].

Recently, progress in modeling the Human Visual System (HVS) has been widely exploited to improve small IR target detection. According to the HVS attention mechanism, the local contrast between a target and its surrounding background matters more to an attention system than the absolute intensity of the visual signal [19]. In the literature, several local contrast measures have been proposed to imitate the HVS selective attention mechanism [2, 4, 6, 7, 11, 16,17,18, 22], and these measures have shown great potential in IR small target detection. For example, Chen et al. [2] proposed a Local Contrast Map (LCM) for local target enhancement and background clutter suppression. Han et al. [11] proposed an improved LCM (ILCM) to increase the detection rate. Wei et al. [22] introduced a Multiscale Patch-based Contrast Measure (MPCM) for small target enhancement and background clutter suppression; although MPCM can simultaneously detect bright and dark targets in IR images, some discrete points still remain under heavy clutter. Deng et al. [4] introduced a Novel Weighted Image Entropy (NWIE) measure using multiscale gray-level difference and local information entropy, which focuses on the suppression of cloud edges. Nasiri et al. [16] recently proposed a Variance Difference (VARD) based method with leading performance. However, its detection performance is still limited by its fixed-size sliding window.

Inspired by the multiscale gray difference used in [4, 20], we propose a Multiscale Gray and Variance Difference (MGVD) joint filter to improve VARD. The two major contributions of this work can be summarized as follows.

1. A maximum contrast measure is used to extract the maximum cross-scale gray difference. Meanwhile, the optimal size map of the internal window is obtained for subsequent use.

2. A revised multiscale variance difference measure is designed to alleviate the impact of background fluctuation and to optimize the calculation of the variance in each internal window.

The rest of this paper is organized as follows. In Sect. 2, we review the VARD method. In Sect. 3, we describe our proposed MGVD joint filter. In Sect. 4, several experiments are conducted to test our method. The paper is concluded in Sect. 5.

2 The VARD Method

Local contrast has been widely used in HVS-inspired IR small target detection [2, 16, 18]. Nasiri et al. [16] recently proposed the VARD method for small target detection. In VARD, a fixed-size sliding structure with three nested windows is first extracted from an IR image, as shown in the right part of Fig. 1. The sizes of the internal, middle, and external windows are set to \(7\times 7\), \(11\times 11\), and \(15\times 15\), respectively.

Given a target with a Gaussian-like shape in an IR image, the target is usually brighter than its surroundings. Therefore, when a target exists in the internal window, the mean intensity of the internal window is higher than that of the middle window. Their difference is computed as:

$$\begin{aligned} P_{rem}(x_{0},y_{0})=M_{in}-M_{mid} \end{aligned}$$
(1)

where \(M_{in}\) and \(M_{mid}\) represent the mean values of the internal and middle windows, respectively, and \(P_{rem}\) is their difference. \((x_{0},y_{0})\) denotes the central pixel under investigation.

Fig. 1. Target with its surrounding regions.

Note that some areas in an IR image with strong evolving clouds are similar to target regions. Therefore, the variance difference between the internal and external windows around the investigated image patch is calculated as follows:

$$\begin{aligned} V\!ARD=V_{in}-V_{e} \end{aligned}$$
(2)
$$\begin{aligned} M_{V\!ARD}=\frac{1}{D_{in}^{2}-1}\sum _{j=1}^{D_{in}^{2}-1}V\!ARD_{j} \end{aligned}$$
(3)

where \(V_{in}\) and \(V_{e}\) represent the variances of the internal and external windows, respectively, \(D_{in}\) is the size of the internal window, and \(D_{in}^{2}-1\) is the number of neighboring patches of an image patch.
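For concreteness, the following is a minimal NumPy sketch of these quantities at a single pixel. The function names, the treatment of all three windows as full squares centered on the investigated pixel, and the choice of pixel offsets inside a \(D_{in}\times D_{in}\) block as the neighboring patches of Eq. 3 are our assumptions, not the reference implementation.

```python
import numpy as np

def vard_terms(img, x, y, d_in=7, d_mid=11, d_ext=15):
    """Eqs. (1)-(2) at pixel (x, y). Windows are taken as full squares
    centered on (x, y); (x, y) is assumed far enough from the borders."""
    r_in, r_mid, r_ext = d_in // 2, d_mid // 2, d_ext // 2
    inner = img[x - r_in:x + r_in + 1, y - r_in:y + r_in + 1]
    middle = img[x - r_mid:x + r_mid + 1, y - r_mid:y + r_mid + 1]
    outer = img[x - r_ext:x + r_ext + 1, y - r_ext:y + r_ext + 1]
    p_rem = inner.mean() - middle.mean()  # Eq. (1): M_in - M_mid
    vard = inner.var() - outer.var()      # Eq. (2): V_in - V_e
    return p_rem, vard

def mean_vard(img, x, y, d_in=7):
    """Eq. (3): average VARD over the D_in^2 - 1 neighbors of (x, y),
    taken here as all pixel offsets inside a D_in x D_in block."""
    r = d_in // 2
    diffs = [vard_terms(img, x + dx, y + dy)[1]
             for dx in range(-r, r + 1) for dy in range(-r, r + 1)
             if (dx, dy) != (0, 0)]
    return np.mean(diffs)
```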

In fact, the size of a small target can range from \(2\times 2\) to about \(9\times 9\) pixels. Ideally, the size of the internal window should match the target size. To deal with this problem, LCM [2], MPCM [22], and NWIE [4] define several multiscale sliding windows to match the size of a real target. The VARD method achieves state-of-the-art detection performance and efficiency. However, the intensity and variance estimates of the internal window in Eqs. 1 and 2 are inaccurate because the internal window has a fixed size. Besides, the number of neighboring image patches used for the calculation of the variance difference (Eq. 3) is insufficient: VARD only considers the situations when a sliding window enters a target region, but not those when a sliding window leaves a target region, as shown in Fig. 2. These factors decrease the accuracy of the local gray and variance differences, and ultimately degrade the detection rate and false alarm rate of the algorithm. Therefore, MGVD is proposed to improve VARD.

3 MGVD-based Small Target Detection

In this section, we introduce a new multiscale IR small target detection method to improve VARD.

3.1 Multiscale Gray Difference

As demonstrated in the literature [4, 14, 15], an IR small target has a signature that is discontinuous with its neighborhood (as shown in Fig. 1). In this paper, a multiscale gray difference is presented to measure the dissimilarity between a target region and its surrounding areas. For an image I, the kth gray difference at point (x, y) can be formulated as:

$$\begin{aligned} D_{k}(x,y)=\left| \frac{1}{N_{\varOmega _{k}}}\sum _{(s,t)\in \varOmega _{k}}I(s,t)-\frac{1}{N_{\varOmega _{max}}}\sum _{(p,q)\in \varOmega _{max}}I(p,q)\right| ^{2} \end{aligned}$$
(4)

where \(k=1, 2, \ldots , K\) corresponds to internal window sizes \(3\times 3\), \(5\times 5\), \(\ldots \), \((2K+1)\times (2K+1)\), and K is the number of variable internal windows. The set \(\varOmega _{k}\) denotes the pixels contained in the internal window, and \(\varOmega _{max}\) denotes the pixels contained in the maximal neighboring area (corresponding to the middle window). \(I(s,t)\) and \(I(p,q)\) represent the gray values at points in \(\varOmega _{k}\) and \(\varOmega _{max}\), and \(N_{\varOmega _{k}}\) and \(N_{\varOmega _{max}}\) are the numbers of pixels in \(\varOmega _{k}\) and \(\varOmega _{max}\), respectively.

Using different sizes of the internal window, we obtain a set of corresponding gray differences \(D_{k}(x,y)\). The maximum difference measure \(D_{max}(x,y)\) at point (x, y) is then

$$\begin{aligned} D_{max}(x,y)=\max \{D_{1}(x,y), D_{2}(x,y), \ldots , D_{K}(x,y)\} \end{aligned}$$
(5)

Consequently, we obtain the maximum contrast map (i.e., \(D_{max}\)) between the internal window and the middle window. The scale that attains the maximum at each pixel also gives the optimal internal window size \(D_{in}^{*}(x,y)\), which is used in Sect. 3.2.
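As an illustration, here is a per-pixel NumPy sketch of Eqs. 4 and 5. The function name and the choice of an \(11\times 11\) middle window as \(\varOmega _{max}\) are our assumptions.

```python
import numpy as np

def max_gray_difference(img, x, y, K=4, d_mid=11):
    """Eqs. (4)-(5): multiscale gray difference at (x, y), returning
    D_max(x, y) and the optimal internal window size D_in*(x, y)."""
    r_mid = d_mid // 2
    m_mid = img[x - r_mid:x + r_mid + 1, y - r_mid:y + r_mid + 1].mean()
    d_max, d_in_star = -np.inf, 3
    for k in range(1, K + 1):  # internal windows 3x3, 5x5, ..., (2K+1)x(2K+1)
        win = img[x - k:x + k + 1, y - k:y + k + 1]
        d_k = abs(win.mean() - m_mid) ** 2  # Eq. (4): squared gray difference
        if d_k > d_max:
            d_max, d_in_star = d_k, 2 * k + 1  # Eq. (5): keep the maximum
    return d_max, d_in_star
```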

3.2 Multiscale Variance Difference

Some areas in an IR image with strong evolving clouds are similar to target regions; therefore, using the gray difference alone is insufficient to extract a target. In addition, since the grayscale values in the middle window may be affected by the target, we further consider the variance difference between the internal window and the external windows of neighboring image patches.

Fig. 2. A sliding neighboring internal window (blue square). (Color figure online)

Different from VARD, we increase the number of neighboring image patches used for the calculation of the variance difference, as illustrated in Fig. 2. In our method, the size of the internal window is set to \(D_{in}^{*}(x,y)\), which corresponds to the maximum difference measure in Eq. 5. With \(D_{2}=2D_{in}^{*}-1\), the number of neighboring image patches is \(D_{2}^{2}-1\). Finally, the multiscale variance difference can be calculated as:

$$\begin{aligned} V\!ARD_{j}^{'}=V_{in}^{'}-V_{e_{j}} \end{aligned}$$
(6)
$$\begin{aligned} M_{V\!ARD^{'}}=\frac{1}{D_{2}^{2}-1}\sum _{j=1}^{D_{2}^{2}-1}V\!ARD_{j}^{'} \end{aligned}$$
(7)

where \(V_{in}^{'}\) represents the variance of the internal window of optimal size \(D_{in}^{*}\), \(V_{e_{j}}\) represents the variance of the external window of the jth neighboring image patch, and \(V\!ARD_{j}^{'}\) is our revised variance difference for a single image patch. Consequently, the final MGVD map is calculated as:

$$\begin{aligned} MGV\!D=D_{max}\odot M_{V\!ARD^{'}}^{2} \end{aligned}$$
(8)

where \(\odot \) denotes the Hadamard (element-wise) product.
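Putting Eqs. 6–8 together, the sketch below assembles the MGVD map pixel by pixel, reusing the max_gray_difference sketch from Sect. 3.1. Treating the neighboring patches as all pixel offsets inside the \(D_{2}\times D_{2}\) neighborhood of Fig. 2, and skipping the border pixels, are our simplifications.

```python
import numpy as np

def mgvd_map(img, K=4, d_mid=11, d_ext=15):
    """Eqs. (6)-(8): per-pixel MGVD map; border pixels are left at zero."""
    h, w = img.shape
    out = np.zeros((h, w))
    pad = (2 * (2 * K + 1) - 1) // 2 + d_ext // 2  # worst-case window reach
    for x in range(pad, h - pad):
        for y in range(pad, w - pad):
            d_max, d_in_star = max_gray_difference(img, x, y, K, d_mid)
            r_in, r_ext = d_in_star // 2, d_ext // 2
            v_in = img[x - r_in:x + r_in + 1, y - r_in:y + r_in + 1].var()
            r = (2 * d_in_star - 1) // 2  # D_2 = 2 * D_in* - 1
            diffs = []
            for dx in range(-r, r + 1):   # the D_2^2 - 1 neighboring patches
                for dy in range(-r, r + 1):
                    if (dx, dy) == (0, 0):
                        continue
                    xe, ye = x + dx, y + dy
                    v_e = img[xe - r_ext:xe + r_ext + 1,
                              ye - r_ext:ye + r_ext + 1].var()
                    diffs.append(v_in - v_e)  # Eq. (6)
            m_vard = np.mean(diffs)           # Eq. (7)
            out[x, y] = d_max * m_vard ** 2   # Eq. (8), per-pixel Hadamard product
    return out
```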

3.3 Detection Procedure

The proposed algorithm has five major steps; a flow chart is shown in Fig. 3. First, image patches with three windows are extracted from an IR image. Second, the maximum contrast measure between the internal window and the middle window is calculated on each image patch. Third, the variance difference is calculated between the internal window and its surrounding background in the external windows. Fourth, the multiscale gray difference map is multiplied with the multiscale variance difference map. Finally, we use the same adaptive-threshold segmentation method as [1, 2, 10] to extract candidate targets. The threshold is computed according to

$$\begin{aligned} T=\mu +k\times \sigma \end{aligned}$$
(9)

where \(\mu \) and \(\sigma \) are the mean and standard deviation of the final enhanced map, respectively. In our experience, k ranges from 2 to 15.
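The five steps then chain into a short driver; this is a sketch built on the mgvd_map function above, with the function name and the particular value of k (any value in the stated range) chosen by us.

```python
def detect_targets(img, k=5.0):
    """End-to-end sketch: MGVD enhancement (steps 1-4, via the mgvd_map
    sketch above) followed by Eq. (9) thresholding (step 5).
    k = 5.0 is an arbitrary choice within the stated range [2, 15]."""
    enhanced = mgvd_map(img)                  # steps 1-4 (Sects. 3.1-3.2)
    t = enhanced.mean() + k * enhanced.std()  # Eq. (9): T = mu + k * sigma
    return enhanced > t                       # boolean mask of candidate targets
```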

Fig. 3. Overview of our proposed MGVD small target detection method.

4 Experimental Results and Analysis

To test the performance of our proposed method, qualitative and quantitative experiments are presented in this section.

4.1 Experimental Setup

To demonstrate the effectiveness of our proposed method, two real IR image sequences with heavy clutter are tested. Example images are shown in Fig. 4, and the details of the two sequences are summarized in Table 1.

Fig. 4. Original images.

Table 1. Details of the two real IR image sequences.

Five recent HVS-based single-frame target detection methods are used as baselines: the Average Gray Absolute Difference Maximum Map (AGADM) [20], LCM [2], NLCM [18], NWIE [4], and VARD [16]. LCM is a traditional HVS-based local contrast method; AGADM and NWIE are multiscale gray difference based methods; NLCM and VARD are joint target detection methods using both grayscale and variance.

Three evaluation criteria are used to measure target enhancement and background suppression performance: Signal to Clutter Ratio Gain (SCRG), Background Suppression Factor (BSF), and Receiver Operating Characteristic (ROC) curves [4, 18, 25]. SCRG and BSF are defined as:

$$\begin{aligned} SCRG=\frac{S_{out}/C_{out}}{S_{in}/C_{in}} \end{aligned}$$
(10)
$$\begin{aligned} BSF=\frac{C_{in}}{C_{out}} \end{aligned}$$
(11)

where \(S_{in}\) and \(S_{out}\) are the amplitudes of the target signal, and \(C_{in}\) and \(C_{out}\) are the standard deviations of the clutter, in the input and output images, respectively.
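For reference, the two criteria reduce to a few lines once \(S\) and \(C\) have been measured; how the target and clutter regions are delimited follows the evaluation protocol of [4, 18, 25] and is not shown here.

```python
def scrg(s_in, c_in, s_out, c_out):
    """Eq. (10): Signal to Clutter Ratio Gain."""
    return (s_out / c_out) / (s_in / c_in)

def bsf(c_in, c_out):
    """Eq. (11): Background Suppression Factor."""
    return c_in / c_out
```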

A ROC curve represents the relationship between the probability of detection and the false alarm rate. Specifically, for a given threshold T in Eq. 9, the probability of detection \(P_{d}\) and the false alarm rate \(P_{f}\) [3, 5, 16] can be calculated as:

$$\begin{aligned} P_{d}=\frac{n_{t}}{n_{c}} \end{aligned}$$
(12)
$$\begin{aligned} P_{f}=\frac{n_{f}}{n} \end{aligned}$$
(13)

where \(n_{t}\), \(n_{c}\), \(n_{f}\) and n represent the number of detected true pixels, ground-truth target pixels, false alarm pixels and the total number of image pixels, respectively.
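A ROC curve can then be traced by sweeping the coefficient k of Eq. 9 over the enhanced map. The sweep range and the use of a boolean ground-truth mask are our assumptions.

```python
import numpy as np

def roc_points(enhanced, gt_mask, ks=np.linspace(2, 15, 27)):
    """Eqs. (12)-(13): (P_f, P_d) pairs over thresholds T = mu + k * sigma."""
    n, n_c = gt_mask.size, gt_mask.sum()
    mu, sigma = enhanced.mean(), enhanced.std()
    points = []
    for k in ks:
        detected = enhanced > mu + k * sigma
        n_t = np.count_nonzero(detected & gt_mask)   # detected true target pixels
        n_f = np.count_nonzero(detected & ~gt_mask)  # false alarm pixels
        points.append((n_f / n, n_t / n_c))          # (P_f, P_d)
    return points
```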

Fig. 5. Target enhancement results obtained by different methods on Sequence 1. Real targets are shown in red rectangles, with a close-up version shown in the bottom left part of each figure. (Color figure online)

Fig. 6. Target enhancement results obtained by different methods on Sequence 2. Real targets are shown in red rectangles, with a close-up version shown in the bottom left part of each figure. (Color figure online)

4.2 Qualitative Results

The target enhancement and detection results achieved by different methods on the two sequences are shown in Figs. 5 and 6. It can be seen that the images processed by our method contain less clutter and residual noise under different clutter backgrounds than those of the baseline methods. This is attributed to the adaptive calculation of the grayscale maximum contrast measure and the variance difference in each image patch. The AGADM and LCM methods are inferior to the other four methods in background suppression. Although the NLCM and NWIE methods can preserve the target to a certain extent, several strong cloud edges remain in their filtered results on the two image sequences. Since the size of the sliding window in VARD is fixed, targets of various sizes cannot be optimally enhanced, so they are missed in some frames of Sequence 1. Besides, after filtering by VARD, cloud edges are still enhanced and remain in the strongly evolving background of Sequence 2.

In summary, the above qualitative results demonstrate that the proposed method achieves the best target enhancement and background suppression performance. However, a few deficiencies remain in our MGVD method. For example, when a target is so far from the imaging system that it occupies only 2–3 pixels in an image, temporal cues from multiple frames should be used to extract it.

4.3 Quantitative Results

The average SCRG and BSF results obtained by our method and the baseline methods are shown in Fig. 7 and Table 2. The SCRGs and BSFs achieved by the AGADM and LCM methods are relatively low, and the NLCM and NWIE methods perform poorly at removing strong evolving cloud edges, especially in Sequence 2. The VARD method is the second best in BSF. In contrast, our proposed method removes isolated clutter residuals and preserves the targets missed by VARD near strong cloud edges (as shown in Sequence 1). Consequently, our MGVD method obtains the highest scores in both SCRG and BSF, achieving remarkable background suppression performance.

Fig. 7. The average SCRG and BSF results achieved by different methods on the two sequences. (a) The average SCRG results. (b) The average BSF results.

Table 2. The average SCRG and BSF results.

Fig. 8. ROC curves. (a) ROC curves of Sequence 1. (b) ROC curves of Sequence 2.

ROC curves are used to further compare our proposed method with the baseline methods. As illustrated in Fig. 8, the ROC curves of our method on the two real image sequences lie closest to the upper left corner; that is, our method outperforms the baseline methods in terms of \(P_{d}\) and \(P_{f}\). On Sequence 1, when the false alarm rate is \(2\times 10^{-5}\), our proposed method and VARD achieve a detection rate of 90%. When the false alarm rate is \(1\times 10^{-4}\), all the methods except LCM obtain a detection rate over 90%. On Sequence 2, when the false alarm rate is \(1\times 10^{-5}\), only our proposed method obtains a detection rate of \(90\%\).

4.4 Computational Efficiency

All the methods were implemented in Matlab 2014a on a PC with a 2.7 GHz CPU and 4.0 GB of RAM. We ran our method on the two real IR image sequences; the run times are 5.73 s and 17.20 s, respectively. Since our method uses a sliding window to check all possible locations in an image, it is not very efficient, and its efficiency should be further improved in future work.

5 Conclusion

This paper presented a joint filter for small target detection using multiscale gray and variance differences. The maximum gray difference is first extracted by an absolute gray difference measure, and the resulting optimal size of the internal window is then used to calculate the variance of the internal window. Finally, the set of neighboring image patches is expanded for the estimation of the variance difference. Experiments show that the proposed method achieves promising target enhancement and background suppression performance on complicated real IR images.