Keywords

1 Introduction

The automatic detection of diatoms is a task of great interest for the scientific community, since it allows to evaluate the quality of water. In this field, the ADIAC project stands out, whose work was pioneering in the recognition of diatoms using image processing techniques [4]. Since its inception, the cell walls detection has been an important task, because it is used to locate and separate diatoms from the background [4, 9, 27]. Currently, it is still a challenge to detect diatoms in water samples that usually include debris and flocs. Therefore, it is a field of great interest that is still being studied [17].

Phase congruency (PC) is a technique used in diatom segmentation [27]. Kovesi was a pioneer in taking the PC to practice in image processing. His contributions corresponded to the possibility of calculating it for two-dimensional signals, lower sensitivity to noise and a good location of the regions in which it occurs [15]. This technique allows to weight the important characteristics of the image, independent of the intensity of the signal with values from zero to one, so it is possible to apply a global threshold level to identify the regions of interest.

In previous work, Jacanamejoy and Forero [13] presented a modification of the PC method, where they expand the bandwidth of the filter bank, in order to improve the detection of very close features, which has as a negative consequence a greater sensitivity to noise. For this reason, the noise estimation technique proposed by Kovesi is inappropriate in this case. Therefore, in this work a new noise estimation method is proposed, that is more effective than the previous one, being appropriate in both situations.

This paper presents in Sect. 2 the PC technique proposed by Kovesi. In Sect. 3 the materials and methods, required to define a new way of calculating the noise level, are presented. The results are analyzed in Sect. 4 and the conclusions in Sect. 5.

2 Phase Congruency

The phase \(\overline{\phi }(x)\) that maximizes Eq. (1), defined in terms of the Fourier components \(A_n\), of a signal in the position x, allows to determine the PC [23]. Each component defines a different \(\eta \) scale, where the smallest scale, \(\eta =1\), is determined by the largest component, \(n=N\). The PC is given by:

$$\begin{aligned} PC(x)= max_{\ \overline{\phi }(x)\in [0,2\pi ]} \frac{\sum _{n=1}^N A_n \cos \left( \phi _n(x)-\overline{\phi }(x)\right) }{\sum _{n=1}^N A_n} \end{aligned}$$
(1)

where \(\phi _n(x)\) corresponds to the phase of the component n. However, Eq. (1) is not easy to implement, and much more so when it comes to images. Therefore, to solve this problem, the local energy E(x) is calculated instead, which is directly proportional to the PC [29]. Thus, the precise calculation of E(x) is essential.

2.1 Local Energy

Equation (2) shows the existing relationship between energy E(x) and PC(x) in one dimensional signals, proposed by [29].

$$\begin{aligned} E(x)=PC(x) \sum _{n=1}^N A_n(x) \end{aligned}$$
(2)

To find the PC from the relationship given in Eq. (2), the calculation of the local energy E(x) and the components \(A_n\) is done, mathematically expressed as

$$\begin{aligned} E(x)= \sqrt{\left( \sum _{n=1}^N e_n(x)\right) ^2+\left( \sum _{n=1}^N o_n(x)\right) ^2} \end{aligned}$$
(3)

and

$$\begin{aligned} A_n(x)= \sqrt{e_n(x)^2 + o_n(x)^2}. \end{aligned}$$
(4)

where \(e_n(x)\) and \(o_n\) are the n-th responses to the odd and even odd quadrature filters respectively [15]. Figure 1a illustrates the geometric relationship between local energy, frequency components and phase congruence and Fig. 1b the representation of the PC as a triangular inequality, where the \(\sum _n A_n(x)\) is always greater than or equal to E(x).

Fig. 1.
figure 1

(a) Relationship between phase congruency, local energy, and the sum of the Fourier amplitudes, adapted from [15]. (b) Representation of the PC by a triangle inequality where the \(\sum _n A_n(x)\) is always greater than or equal to E(x).

2.2 Calculation of Phase Congruence

The calculation of the components \(e_n(x)\) and \(o_n(x)\) is the computationally most expensive part of the PC method. Initially, Kovesi used wavelet filters to calculate the PC [15], which is inefficient since a large number of filters are required to obtain the different directions and frequency components. A more efficient alternative was proposed by Felsberg using monogenic filters, which result from the extension of the Hilbert transform to two dimensions [6], although its practical application was not immediate. Later, Kovesi implemented a software version of the PC employing monogenic filters. This code is available on the internet and it is compatible with Matlab and Octave [16]. Likewise, Li Juan formalized the methodology for calculating the PC based on monogenic filters [18]. However, although the PC implementation based on these filters is much more efficient, there are still new publications from different areas that employ the PC implementation based on wavelets [2, 5, 24, 28, 30,31,32].

In one-dimensional signals, \(e_\eta (x)\) and \(o_\eta (x)\) are obtained as a result of the convolution of the original signal with a pair of wavelets of even and odd symmetry at the scale \(\eta \) [15], then, the conversion between scales and components is given by \(\eta =N-n+1\).

In 2D signals, it is more appropriate to use monogenic filters. In this case, log-Gabor filters are employed, in which the transfer function is mathematically expressed as:

$$\begin{aligned} G(\omega )= exp\left( \frac{- (\log (\omega /\omega _0))^2}{2(\log (\sigma _o))^2}\right) \end{aligned}$$
(5)

where \(\sigma _0\) defines the filter bandwidth and \(\omega _0\) the central frequency given by:

$$\begin{aligned} \omega _o = \frac{1}{\lambda _{_{min}}m^{N-n}}=\frac{1}{\lambda _{_{min}}m^{\eta -1}} \end{aligned}$$
(6)

\(\lambda _{min}\) is the length of the signal in the space x associated with the filter with the highest frequency, n the n-th frequency component, \(\eta \) the \(\eta \)-th scale and m the scale factor between consecutive filters.

When the log-Gabor filter is extended to two dimensions, a variable change is done, \(\omega =|{\varvec{u}} |\), where \({\varvec{u}}=(u_1,u_2)\) is the two-dimensional frequency, and \(G: \mathbb {R}^2\rightarrow \mathbb {R}\). Then, \(e_n({\varvec{x}})\) is obtained as a result of the convolution of the image \(I({\varvec{x}})\) with the log-Gabor filter of the n-th component \(G_n({\varvec{x}})\), being \({\varvec{x}}=(x_1,x_2)\) the image coordinates, ie,

$$\begin{aligned} e_n({\varvec{x}})=I({\varvec{x}})*G_n({\varvec{x}}). \end{aligned}$$
(7)

Once \(e_n({\varvec{x}})\) is calculated by using the two-dimensional Hilbert transform [7], \(o_n({\varvec{x}})\) is found from the equation:

$$\begin{aligned} o_n({\varvec{x}})=-\frac{{\varvec{x}}}{2\pi |{\varvec{x}}|^3}*e_n({\varvec{x}}). \end{aligned}$$
(8)

It is important to note that the image \(I({\varvec{x}})\) must not have a zero frequency component. To eliminate it, the technique proposed by [21] is employed. Although \(e_n({\varvec{x}})\) generates values in \(\mathbb {R}\), \(o_n({\varvec{x}})\) can take values in \(\mathbb {R}^2\). Therefore, when using \(o_n({\varvec{x}})\) in Eqs. (3) and (4), only its magnitude should be used, i.e., \(A_n\) and E are redefined as:

$$\begin{aligned} A_n({\varvec{x}})= & {} \sqrt{e_n({\varvec{x}})^2 + \left| o_n({\varvec{x}})\right| ^2}, \end{aligned}$$
(9)
$$\begin{aligned} E({\varvec{x}})= & {} \sqrt{\left( \sum _{n=1}^N e_n({\varvec{x}})\right) ^2+\left( \sum _{n=1}^N \left| o_n({\varvec{x}})\right| \right) ^2}, \end{aligned}$$
(10)

As it can be seen in Eqs. (9) and (10), \(A_n({\varvec{x}})\) and \(E({\varvec{x}})\) only generate real values. Therefore, all the equations that are defined from \(A_n\) and E, that were developed for one-dimensional signals, also works in two dimensions.

In addition to the calculation of \(e_n(x)\) and \(o_n(x)\), the adaptation with monogenic filters [16] has additional changes indicated in Eq. (11). In this way, the calculation of the PC has three factors: a PC weighting factor fixed according to the frequency band W(x), the simple expression to calculate the PC, and a term to avoid the generation of erroneous PCs due to noise.

$$\begin{aligned} PC(x)=W(x).\lfloor 1-\alpha \left| \delta (x)\right| \rfloor .\frac{\left\lfloor E(x)-T\right\rfloor }{E(x)+\varepsilon } \end{aligned}$$
(11)

where,

$$\begin{aligned} W(x)= & {} \frac{1}{ 1+ \exp \left( \gamma (c-s(x)) \right) }, \end{aligned}$$
(12)
$$\begin{aligned} s(x)= & {} \frac{1}{N}\left( \frac{\sum _{n=1}^N A_n(x)}{\varepsilon + A_{max}(x)}\right) , \end{aligned}$$
(13)

W(x) is a sigmoid function, where s(x) is the measure of the dispersion corresponding to the frequency spectrum of different components, c is the cut-off value of the filter response wideband (spread), below which phase congruency values become penalized, and \(\gamma \) is a gain factor that controls the sharpness of the cutoff [15].

The second term of Eq. (11), corresponding to the simple quantization of the PC, generates values between zero and one, only for small values of \(\delta (x)\), the angle of the right triangle formed from the triangular inequality illustrated in Fig. 1b, and \(\alpha \) limits the range of \(\delta (x)\) that is considered, setting the sensitivity of the PC.

The noise compensation, given by the expression \(\left\lfloor E(x)-T\right\rfloor /(E(x)+\varepsilon \), causes the term to be zero when the \(E(x)\le T\). The value of T is set according to the image noise level. The constant \(\varepsilon \) corresponds to a small value that avoids division by zero, and appears in the Eqs. (11) and (13). More detailed information about the first and third factors of Eq. (11) is found in [15], and for the second in [16].

2.3 Noise Estimation

The implementation of Eq. (11) developed by Kovesi [16], and denoted by \( PC_k (x) \), estimates the noise ratio as \(T= \mu _R+k\sigma _R\), where \(\mu _R\) and \(\sigma _R\) are the mean and variance of the magnitude of the energy noise vector and k is a factor empirically found to adjust T depending on the case. Kovesi [15] fixes the energy image noise magnitude distribution to a Rayleigh function, to find the parameters \(\mu _R\) and \(\sigma _R\).

The probability density function of the Rayleigh distribution is given by:

$$\begin{aligned} R(x;\sigma _G)=\frac{x}{\sigma ^2_G} \exp \left( \frac{-x^2}{2\sigma ^2_G}\right) , \end{aligned}$$
(14)

where the mean and variance are given by:

$$\begin{aligned} \mu _R=\sigma _G\sqrt{\frac{\pi }{2}}, \text { and } \sigma _R^2=\frac{4-\pi }{2}\sigma _G^2, \end{aligned}$$
(15)

and \(\sigma _G\) is the mode of the Rayleigh distribution.

Kovesi proposes to calculate \(\sigma _G\) by using the mode of the component \(A_N(x)\), denominated \(\tau \) [14]. By performing a simple approach, the estimated noise response at successive filters is scaled, inversely proportional to the bandwidth. In this way, \(\sigma _G\) can be expressed by a simple geometric sum:

$$\begin{aligned} \sigma _G= \sum _{n=0}^{N-1}\tau \left( \frac{1}{m} \right) ^n= \tau \frac{1-(1/m)^N}{1-(1/m)}, \end{aligned}$$
(16)

In his webpage, Kovesi proposes two different ways of calculating \(\tau \) [16]. In the first one, it corresponds to the mode of \(A_N(x)\), as it was mentioned before. In the second one, it is assumed that \(A_N(x)\) fixes a Rayleigh distribution, so that:

$$\begin{aligned} \tau = \frac{{\text {median}}(A_N(x))}{\sqrt{\ln (4)}}, \end{aligned}$$
(17)

This second form is more robust, since it does not depend on a single value of \(A_N(x)\), but on the data set. Thus, to find an adequate value of the noise threshold T, the mode of \(A_N(x)\) and the scale factor m between successive filters are taken into account. Therefore, according to (15) and (16) \(\tau \), m and k define T.

It is important to clarify that estimating adequate values of T is possible if the following three assumptions are done: image noise is additive, a constant noise power spectrum, and features, as edges, occur only at isolated locations of the image [15]. This latter requirement may not be as strict as the parameters \(\sigma _o\) and \(\alpha \) are adjusted, but since there is a compromise between the sensitivity of the PC to nearby edges and the response to noise, the choice of these parameters is not trivial [13]. For this reason, a new method to find a more accurate T to rule out noise is proposed.

3 Weibull Distribution

In many studies, Gaussian, Rayleigh, and Exponential distributions have been used as image models [1]. For example, the luminance within shadow regions in sonar imagery is well modeled by the Gaussian distribution, while the Rayleigh distribution [25], introduced in 1880, is more accurate in the reverberation regions [20], Martin et al. [19] analyzed level set implementation of region snakes based on the maximum likelihood method for different noise models that belong to the exponential family.

These distributions can be worked from a single distribution called Weibull with probability density function (pdf)

$$\begin{aligned} f(x;\lambda , \zeta )= \frac{\zeta }{\lambda } \left( \frac{x}{\lambda }\right) ^{\zeta -1} e^{-(x/\lambda )^\zeta } \end{aligned}$$
(18)

and cumulative distribution function (cdf)

$$\begin{aligned} F(x)=1-e^{-(x/\lambda )^\zeta } \end{aligned}$$
(19)

where \(\lambda >0\) is the scale parameter and \(\zeta >0\) is the shape parameter of the distribution. The Weibull distribution becomes Exponential or Rayleigh depending on the parameters in Eq. (18). In this way, if \(\zeta =1\), the Weibull distribution is an Exponential one and, if \(\zeta =2\) is a Rayleigh one [8, 22].

It has been proved theoretically and experimentally verified that the first-order derivatives of a wide variety of textures follow a Weibull distribution [1, 10, 12]. A study on 45,000 photographs [12], taken from the Curet database of natural textures [3], revealed that 60% of all images have strict contrast distributions in the form of Weibull and the remaining 40% has a distribution close to Weibull or is highly regular. In 2009, Scholte et al. suggested the brain models the image information like a Weibull distribution [26], i.e., the parameters of a Weibull distribution inform the brain about the spatial coherence of the perceived scene. However, the model is only valid for a limited amount of natural images [12] and the parameters of the Weibull distribution are sensitive to the conditions of the image such as illumination, enlargement of the camera and resolution power, and texture orientation [11]. Still, the Weibull distribution is considered a good model for image segmentation and versatile enough to represent a wide variety of images [1]. Therefore, the Weibull distribution is a general model that be can used to estimate the background in images, i.e., the Weibull distribution can be used to generalize the different gray levels distributions of the noise in an energy image, being its shape parameter very relevant.

4 Material

To develop and initially evaluate the proposed method the Kovesi code [16] was executed in Octave and 253 images of diatoms, taken from the ADIAC database [4], were used, along with a synthetic image shown in Fig. 2a. The algorithm was implemented in Java, as a plugin of the free access software ImageJ.

5 Proposed Method

The information distribution of an image can be determined by its local energy. If the local energy is represented by means of a monochromatic image, information can be quantified in grey levels, where pixels with the highest values correspond to regions with most information (edges, ridges) and pixels close to zero to regions of little interest, where the ratio signal-to-noise is very low and the PC must not be calculated. Two examples of local-energy images and their histograms are shown in Fig. 2c, d, g and h.

5.1 Weibull Approximation and Threshold Estimation

As it can be seen in Fig. 2d and h, the local energy function shape of the background is either Exponential (\(\zeta \approx 1\)), when the image has little noise, or Rayleigh (\(\zeta \approx 2\)), when the image is highly contaminated with Gaussian noise, respectively.

Fig. 2.
figure 2

(a) Sintetic image. (b) PC. (c) Local energy. (d) Histogram of the local energy. (e) Sintetic image with noise. (f) PC. (g) Local energy. (h) Histogram of the local energy.

Once the image representation function is established, it is necessary to find an adequate threshold value to rule out noise variations. As mentioned above, the local energy is higher in the regions of interest with respect to the background. Then, it is possible to determine a threshold value that allows the segmentation of the objects from the background. Thus, the gray level distribution of the local energy image is given by the Weibull distribution that corresponds to the background, and the region of interest that belongs to another distribution, as illustrated in Fig. 2. Therefore, the threshold must be located at an intermediate point between these two distributions.

A quick way to estimate an appropriate threshold, above which phase congruence can be calculated, is to find a value proportional to the gray level mode of the local-energy image histogram. From Eq. (19), we estimate a threshold value depending on the amount of noise we want to eliminate. In this way, we define the threshold level \(x_u\) as a product between the factor p and the mode \(x_m\), i.e., \(x_u=px_m\).

$$\begin{aligned} x_m= & {} \lambda \left( \frac{\zeta -1}{\zeta } \right) ^{1/\zeta } \end{aligned}$$
(20)
$$\begin{aligned} F(x_u)= & {} 1-exp\left( \frac{p^\zeta (1-\zeta )}{\zeta } \right) \end{aligned}$$
(21)

The amount of noise to be discarded taking into account the threshold \(x_u\), according to the (19) is given by (21). Therefore, to discard more than 95% noise, if \(\zeta \approx 1.1\) a factor \(p\approx 25\) is required, and if \(\zeta \approx 2\) then \(p\approx 3\).

Fig. 3.
figure 3

(a) Diatom image. (b) Local energy. (c) Histogram of local energy.(d) PC by noise estimation with Kovesi algorithm. (e) PC by noise estimation proposed in this work.

6 Results

Figure 3 illustrates the results obtained by using Kovesi’s method and the technique proposed in this work. In this example, since the background local-energy histogram shape, shown in Fig. 3a, approximates an exponential distribution, a factor \(p=25\) was employed for noise estimation. Other examples are shown in Fig. 4, where the factor values p were chosen according to the background local-energy shape. As can be seen, the proposed method allows obtaining better results. Diatoms are well detected, edges are better defined with a lower amount of noise.

Fig. 4.
figure 4

(a) Diatom image. (b) PC by noise estimation with Kovesi algoritm. (c) PC with \(p=10\). (d) Diatom image. (e) PC by noise estimation with Kovesi algoritm. (f) PC with \(p=4\). (g) Horizontal profile of (b) and (c). (h) Horizontal profile of (e) and (f).

7 Conclusions

Phase congruency is a powerful image processing technique for segmentation. However, an important limitation is its sensitivity to noise. Therefore, to prevent noise from affecting segmentation results, a new noise level estimation method was proposed. The Weibull distribution of the local energy image was employed to estimate the noise profile and a threshold was proposed as a proportion of the local energy histogram mode. The results showed that the Weibull distribution allows a better estimation of the noise level.

By means of the proposed technique, it is possible to accurately identify the amount of noise to be discarded. Additionally, it allows estimating the compromise between the percentage of the region of interest to be discarded against the noise to be handled.

It is also worth mentioning that, in future, an automatic way to find the threshold value could be found by effectively calculating the noise shape parameter from the local-energy image histogram.