
1 Introduction

Noise is a contaminating signal that seriously degrades video image quality, complicates subsequent processing of the video, and impairs the viewing experience. Traditional video denoising algorithms assume that the noise level is known [1,2,3], so accurate noise level estimation is necessary and can further improve denoising performance. Current noise estimation algorithms fall into three categories: spatial domain, transform domain, and matrix domain.

Spatial-domain noise estimation operates directly on the noisy image and relies mainly on its weakly textured regions to estimate the noise variance. It is divided into image-block-based, filtering-based, and hybrid noise estimation. Liu et al. [4] proposed an image-block-based method that uses the gradient covariance matrix to find the weakly textured regions of the image and estimates the noise level from those regions; the estimate is valid only for Gaussian noise. Pei et al. [5] proposed a filtering-based method, which processes the texture details of the image with an adaptive filter and estimates the noise level from the noisy image blocks together with their filtered counterparts. In the transform domain, Ponomarenko et al. [6] proposed a DCT-based noise estimation algorithm, which transforms the image into the DCT domain to separate signal from noise and achieves fairly accurate estimates on simple images with little texture. Based on Yu, Li et al. [7] proposed a test-based noise estimation method, which estimates the noise by reducing the image values through a wavelet transform and by using the relationship between the noise signal and its variance. Donoho [8] proposed using wavelet soft thresholding for noise estimation, computing the median absolute deviation (MAD) in the HH sub-band of the wavelet transform to estimate the noise standard deviation.

Both spatial-domain and transform-domain noise estimation require the image blocks to be sufficiently smooth. In many cases, however, the given video image contains many random textures, and the algorithms above cannot handle video images with such complex textures. Matrix-domain noise estimation instead uses matrix decomposition to separate the video image signal from the noise signal, so it also applies to video images with more complex textures [9]. Matrix-domain noise estimation is mainly divided into two types: principal component analysis and singular value decomposition.

The image-block-based principal component analysis (PCA) algorithm proposed by Pyatykh et al. [10] handles video images with smooth areas and also yields a more accurate noise level for video images with more complex textures. Liu and Lin [11] proposed a noise estimation algorithm based on singular value decomposition (SVD), which estimates the noise level from the singular values of the noisy video image. However, these algorithms target Gaussian noise only, and their estimates for sensor noise, i.e. Poisson-Gaussian noise, are not ideal.

Building on PCA, this paper incorporates a variance-stabilizing transformation (VST) to estimate the noise in noisy images, which makes it possible to estimate not only simple Gaussian noise but also the sensor noise level. In addition, the excess kurtosis (the "excessive peak" of the noise distribution) is introduced to check the accuracy of the mixed-noise parameter estimates: the parameters of the VST are estimated by minimizing the excess kurtosis of the noise distribution. Subjective and objective results show that combining the proposed noise estimation algorithm with a classical video denoising algorithm achieves good results and broadens the range of video denoising applications.

2 Related Works

2.1 PCA Noise Estimation

For the image block model, appropriate image blocks must be selected for principal component analysis. First define a positive integer m. When all the image blocks \( x_{i} \) of the original video image lie in a subspace \( V_{M - m} \subset {\mathbb{R}}^{M} \) whose dimension M − m is smaller than the vector dimension M, the information in the original video image x is redundant. The desired image blocks are selected under this assumption, using the following criterion:

$$ d_{i} = Dis(x_{i} ,V_{M - m} ), \quad i = 1, \ldots ,K $$
(1)

In Eq. (1), Dis(\( \cdot \)) denotes distance: when the distance between the image block \( x_{i} \) and the subspace \( V_{M - m} \subset {\mathbb{R}}^{M} \) falls within a given range, \( x_{i} \) is accepted as an appropriate image block.

Since the noise signal and the video image signal are independent of each other, the variance of the original image block, the noise variance, and the variance of the noisy image block are related as follows:

$$ s^{2} (x_{i} ) = s^{2} (y_{i} ) - \sigma^{2} , \quad i = 1, \ldots ,K $$
(2)

Here \( \sigma \) is the noise standard deviation. Eq. (2) indicates that the variance of the noisy image block \( y_{i} \) is positively correlated with the block distance \( d_{i} \). The appropriate image blocks for principal component analysis can therefore be selected according to the standard deviation of the noisy image blocks.

After the eigenvalue decomposition, the principal components of the vector Y are obtained as \( \overline{v}_{Y,i}^{T} Y \), and they satisfy the following relationship:

$$ s^{2} (\overline{v}_{Y,i}^{T} Y) = \overline{\lambda }_{Y,i} , \quad i = 1,2, \ldots ,M $$
(3)

where \( s^{2} \) denotes the sample variance, \( \overline{\lambda }_{Y,i} \) are the eigenvalues of the sample covariance matrix \( S_{Y} \), and \( \overline{v}_{Y,i} \) are the corresponding eigenvectors. Given the block selection criterion and the vector Y formed from the selected image blocks, the m smallest eigenvalues satisfy the following relationship:

$$ E(\left| {\overline{\lambda }_{Y,i} - \sigma^{2} } \right|) = o(\sigma^{2} /\sqrt N )\,\,\,\,\,\,\,N \to \infty $$
(4)

where i ranges over [M − m + 1, M]. Since the original video image signal and the noise signal are independent of each other, the covariance between X and N is zero, and the following holds:

$$ \begin{aligned} & S_{Y} = S_{X} + S_{N} \\ & S_{N} = \sigma^{2} I \\ \end{aligned} $$
(5)

where \( S_{Y} \), \( S_{X} \), and \( S_{N} \) are the covariance matrices of Y, X, and N, respectively. Since the m smallest eigenvalues of \( S_{X} \) are zero, the corresponding m smallest eigenvalues of \( S_{Y} \) equal \( \sigma^{2} \). When the number of samples N is large, the following is satisfied:

$$ \mathop {\lim }\limits_{N \to \infty } E(\left| {\overline{\lambda }_{Y,M} - \sigma^{2} } \right|) = 0 $$
(6)

As Eq. (6) shows, when the number of samples is large enough, the smallest eigenvalue of the sample covariance matrix \( S_{Y} \) of the noisy video image is approximately equal to the variance of the noise signal. The noise variance can therefore be estimated from the eigenvalues of the sample covariance matrix of the noisy image.
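The relationship in Eq. (6) can be illustrated with a minimal NumPy sketch. The data are synthetic and purely illustrative: the noise-free vectors are drawn from a random (M − m)-dimensional subspace, and the function name `smallest_eigenvalue_noise_variance` and all numerical values are assumptions made for this example, not part of the original method.

```python
import numpy as np

def smallest_eigenvalue_noise_variance(Y):
    """Estimate the noise variance as the smallest eigenvalue of the
    sample covariance matrix of Y, where Y has shape (N, M):
    N vectorized image blocks of dimension M."""
    S_Y = np.cov(Y, rowvar=False)          # M x M sample covariance matrix
    eigenvalues = np.linalg.eigvalsh(S_Y)  # ascending order for symmetric input
    return eigenvalues[0]

# Synthetic check (assumption): noise-free blocks lie in an
# (M - m)-dimensional subspace of R^M, as required in Sect. 2.1.
rng = np.random.default_rng(0)
N, M, m, sigma = 20000, 16, 4, 5.0
basis = rng.normal(size=(M - m, M))                  # spans the signal subspace
X = rng.normal(scale=20.0, size=(N, M - m)) @ basis  # redundant noise-free blocks
Y = X + rng.normal(scale=sigma, size=(N, M))         # additive white Gaussian noise

sigma2_hat = smallest_eigenvalue_noise_variance(Y)
print(sigma2_hat, sigma ** 2)  # the estimate should be close to sigma^2
```

For finite N the smallest sample eigenvalue tends to lie slightly below \( \sigma^{2} \), which is consistent with the limit in Eq. (6) holding only as N grows.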

2.2 VST-Based Noise Estimation

PCA-based noise estimation targets images corrupted by Gaussian noise only and performs poorly on sensor noise, that is, Poisson-Gaussian noise. Combining it with a variance-stabilizing transformation (VST) makes it possible to estimate both simple Gaussian noise and the sensor noise level with relatively good accuracy.

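The variance expression below corresponds to the usual sensor-noise model \( y(p) = a\omega (p) + n(p) \), in which \( \omega (p) \) is Poisson distributed with mean \( \lambda (p) \), \( n(p) \) is zero-mean Gaussian noise with variance b, and \( x(p) = a\lambda (p) \) is the original pixel value. Taking the variance of the noisy image model gives:

$$ \text{var} (y(p)) = a^{2} \lambda (p) + b = ax(p) + b $$
(7)

In Eq. (7), the parameter a is the multiplicative factor, and the noise variance of the noisy image is a linear function of the original pixel value.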

By the properties of the Poisson distribution, when λ(p) is large enough, ω(p) is approximately normally distributed with mean λ(p) and variance λ(p). The noisy pixel value y(p) is then approximately normally distributed with mean x(p) and variance \( ax(p) + b \), so the sensor noise can be approximated as additive white Gaussian noise. With \( \xi (p) \) denoting zero-mean, unit-variance Gaussian noise, the noisy pixel value y(p) and the original pixel value x(p) satisfy:

$$ y(p) \approx x(p) + \sqrt {ax(p) + b} \xi (p) $$
(8)
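The variance law of Eq. (7) can be checked numerically with a short simulation. The sketch below assumes the scaled-Poisson-plus-Gaussian form y = a·ω + n with ω ~ Poisson(x/a) and n ~ N(0, b), which is consistent with Eqs. (7) and (8) but is stated here as an assumption; the helper name `add_poisson_gaussian_noise` and the parameter values are hypothetical.

```python
import numpy as np

def add_poisson_gaussian_noise(x, a, b, rng):
    """Simulate y = a * omega + n with omega ~ Poisson(x / a) and n ~ N(0, b).
    This model form is an assumption consistent with Eqs. (7)-(8)."""
    omega = rng.poisson(x / a)                      # Poisson counts with mean x / a
    n = rng.normal(scale=np.sqrt(b), size=x.shape)  # Gaussian part with variance b
    return a * omega + n

rng = np.random.default_rng(1)
a, b = 0.5, 4.0  # hypothetical noise parameters
for level in (20.0, 100.0, 200.0):
    x = np.full(200_000, level)  # flat patch of constant intensity
    y = add_poisson_gaussian_noise(x, a, b, rng)
    # Empirical variance next to the prediction a * x + b of Eq. (7).
    print(level, y.var(), a * level + b)
```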

Estimating the noise level from Eq. (8) first requires the noise parameters a and b. Noise level estimation under this model therefore reduces to estimating the parameters of the noise model.

Define f(y(p); a, b), a function of the noisy image y(p), as the variance-stabilizing transformation. The noise standard deviation of the transformed image f(y(p); a, b) is required to be independent of the original image, that is, equal to a constant \( \sigma \):

$$ std(f(y(p);a,b)) = \sigma $$
(9)

A first-order Taylor expansion of the transformed image f(y(p); a, b) about x(p) gives:

$$ f(y(p);a,b) \approx f(x(p);a,b) + f^{'} (x(p);a,b)(y(p) - x(p)) $$
(10)

Taking the standard deviation of both sides of the expansion gives the approximate form of Eq. (9):

$$ f^{'} (x(p);a,b) \cdot std(y(p)) = \sigma $$
(11)
$$ f^{'} (x(p);a,b) = \frac{\sigma }{std(y(p))} = \frac{\sigma }{{\sqrt {ax(p) + b} }} $$
(12)

Integrating both sides of Eq. (12) gives the following expression:

$$ f(t;a,b) = \frac{2\sigma }{a}\sqrt {at + b} $$
(13)

In Eq. (13), t denotes a random variable, and f(t; a, b) is its variance-stabilizing transformation. The VST is a smooth function and the noisy image y(p) is approximately normally distributed. Therefore, when the variance of y(p) is sufficiently small, the transformed image f(y(p); a, b) is also approximately normally distributed, and \( std(f(y(p);a,b)) \approx \sigma \) holds for all pixel values. The noise in the transformed image can thus be approximated as additive white Gaussian noise.
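The stabilizing effect of Eq. (13) can be sketched as follows, using the Gaussian approximation of Eq. (8) to generate noisy samples at several intensities. The function name `vst`, the clamp inside the square root, and the parameter values are illustrative assumptions.

```python
import numpy as np

def vst(t, a, b, sigma=1.0):
    """Variance-stabilizing transformation of Eq. (13).
    The clamp at zero is an added safeguard against negative arguments
    and is not part of Eq. (13)."""
    return (2.0 * sigma / a) * np.sqrt(np.maximum(a * t + b, 0.0))

rng = np.random.default_rng(2)
a, b, sigma = 0.5, 4.0, 1.0  # hypothetical noise parameters and target std
for level in (20.0, 100.0, 200.0):
    # Gaussian approximation of Eq. (8): y = x + sqrt(a x + b) * xi
    y = level + np.sqrt(a * level + b) * rng.normal(size=200_000)
    # Before the VST the std grows with the intensity;
    # after the VST it should stay close to sigma at every level.
    print(level, y.std(), vst(y, a, b, sigma).std())
```

Because the derivation relies on a first-order expansion, the stabilized standard deviation is only approximately equal to \( \sigma \), with the largest deviation expected at low intensities.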

2.3 PCA-Based Image Block Transformation

As the variance-stabilizing property above shows, the noise in the transformed image f(y(p); a, b) is approximately additive white Gaussian noise with standard deviation \( \sigma \). The transformed image can therefore be regarded as a noise-free image Z plus noise, so that its expectation is the noise-free image Z:

$$ E(f(y(p);a,b)) = Z $$
(14)

Let N be the number of image blocks in the transformed image and K the size of each image block. Each block is converted into a vector of size K after removing its unnecessary elements, giving the N vectors \( v_{1} , \ldots ,v_{N} \) of the transformed image and the N vectors \( u_{1} , \ldots ,u_{N} \) of the image Z. To separate the noise signal from the image signal effectively, the image Z is assumed to be redundant, that is, the vectors \( u_{1} , \ldots ,u_{N} \) lie in a subspace of dimension less than K. PCA then proceeds as follows:

  (1) Calculate the mean vector of \( v_{1} , \ldots ,v_{N} \):

    $$ \overline{v} = \frac{1}{N}\sum\limits_{i = 1}^{N} {v_{i} } $$
    (15)

  (2) Calculate the sample covariance matrix of \( v_{1} , \ldots ,v_{N} \):

    $$ S = \frac{1}{N - 1}\sum\limits_{i = 1}^{N} {(v_{i} - \overline{v} )(v_{i} - \overline{v} )^{T} } $$
    (16)

  (3) Compute the normalized eigenvectors \( a_{1} , \ldots ,a_{K} \) of the sample covariance matrix S. These eigenvectors form an orthonormal basis and satisfy the following relations:

    $$ s^{2} (a_{1}^{T} v_{i} ) \ge s^{2} (a_{2}^{T} v_{i} ) \ge \cdots \ge s^{2} (a_{K}^{T} v_{i} ) $$
    (17)

    where \( s^{2} ( \cdot ) \) denotes the sample variance.

  (4) Calculate the weights:

    $$ \omega_{k,i} = a_{k}^{T} (v_{i} - \overline{v} ) $$
    (18)

where k ranges over [1, K] and i over [1, N]. \( \omega_{k,i} \) is the kth weight of the centered vector \( (v_{i} - \overline{v} ) \), abbreviated as \( \omega_{k} \). Since the noise vector follows a multivariate Gaussian distribution, the following holds:

$$ s^{2} (a_{k}^{T} v_{i} ) \approx s^{2} (a_{k}^{T} u_{i} ) + \sigma^{2} $$
(19)

The sample variance \( s^{2} (a_{k}^{T} v_{i} ) \) equals the kth eigenvalue of the sample covariance matrix S. When PCA exploits the redundancy of the noise-free image Z, the sample vectors \( u_{1} , \ldots ,u_{N} \) can be represented linearly by the first M eigenvectors (with M < K) and are orthogonal to the last eigenvector \( a_{K} \). Therefore, the distribution of the weight \( \omega_{K} \) is the same as the noise distribution [5], and in practice the noise distribution can be studied through the distribution of \( \omega_{K} \). From Eq. (19), when \( s^{2} (a_{K}^{T} u_{i} ) = 0 \), the noise variance is approximately the variance of this weight:

$$ s^{2} (\omega_{K} ) \approx \sigma^{2} $$
(20)
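Steps (1) to (4) and Eq. (20) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions: blocks are taken as all overlapping 4 × 4 windows, the test image is a synthetic linear ramp with white Gaussian noise added directly (so no VST is applied), and the helper names `extract_blocks` and `last_component_weights` are hypothetical.

```python
import numpy as np

def extract_blocks(image, size):
    """Return all overlapping size x size blocks as rows of an (N, K) array,
    with K = size ** 2."""
    windows = np.lib.stride_tricks.sliding_window_view(image, (size, size))
    return windows.reshape(-1, size * size)

def last_component_weights(V):
    """Steps (1)-(4): weights of the centered vectors along the eigenvector
    with the smallest eigenvalue (a_K in the ordering of Eq. (17))."""
    v_bar = V.mean(axis=0)               # Eq. (15): mean vector
    S = np.cov(V, rowvar=False)          # Eq. (16): normalized by N - 1
    eigenvalues, A = np.linalg.eigh(S)   # ascending eigenvalues, orthonormal columns
    a_K = A[:, 0]                        # direction of smallest sample variance
    return (V - v_bar) @ a_K             # Eq. (18) with k = K

# Synthetic test image (assumption): a linear ramp plus white Gaussian noise.
rng = np.random.default_rng(3)
sigma = 3.0
rows, cols = np.mgrid[0:128, 0:128]
clean = 0.5 * rows + 0.25 * cols
noisy = clean + rng.normal(scale=sigma, size=clean.shape)

omega_K = last_component_weights(extract_blocks(noisy, 4))
print(omega_K.var(ddof=1), sigma ** 2)   # Eq. (20): s^2(omega_K) is close to sigma^2
```

The sample variance of the returned weights equals the smallest eigenvalue of S, so the estimate is typically slightly below \( \sigma^{2} \) for a finite number of blocks.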

3 VST Correction Based on Excess Kurtosis

When estimating the noise level, the distribution of the noise signal must also be considered. Because the true noise parameters are unknown, the parameters a and b used in the VST may deviate from the true ones. The VST may then fail to stabilize the variance, and the noise distribution after the transformation departs from the normal distribution. The normality of the noise therefore needs to be measured in order to evaluate the transformation parameters. Here the excess kurtosis (the "excessive peak") is used for this check. For a random variable X it is defined as:

$$ \gamma_{X} = \frac{{E((X - E(X))^{4} )}}{{E((X - E(X))^{2} )^{2} }} - 3 $$
(21)

When the random variable follows a normal distribution, its excess kurtosis \( \gamma_{X} \) is zero. A vanishing excess kurtosis is therefore a necessary condition for the noise to be normally distributed.

For the noise model above, this condition is also sufficient. Let \( x_{1} < \cdots < x_{M} \) be the pixel values of the image \( x \), with corresponding probabilities \( h_{1} , \ldots ,h_{M} \), and assume that the parameters used in the VST are \( a^{'} \), \( b^{'} \), which are not equal to the true noise model parameters a and b. By the first-order Taylor expansion, the standard deviation of the transformed image \( f(y(p);a^{'} ,b^{'} ) \) is:

$$ \begin{aligned} std(f(y(p);a^{'} ,b^{'} )) & \approx f^{'} (x(p);a^{'} ,b^{'} ) \cdot std(y(p)) \\ & = \sigma \frac{{\sqrt {ax(p) + b} }}{{\sqrt {a^{'} x(p) + b^{'} } }} \\ \end{aligned} $$
(22)

From Eq. (22), the noise variance at each pixel value \( x_{1} < \cdots < x_{M} \) of the image \( x \) is:

$$ \sigma_{i}^{2} = \sigma^{2} \frac{{ax_{i} + b}}{{a^{'} x_{i} + b^{'} }}, \quad i = 1, \ldots ,M $$
(23)

For pixels with value \( x_{i} \), the transformed image \( f(y;a^{'} ,b^{'} ) \) is thus normally distributed with variance \( \sigma_{i}^{2} \). The noise in the transformed image \( f(y;a^{'} ,b^{'} ) \) can therefore be expressed as a mixture of Gaussian distributions \( N(0,\sigma_{i}^{2} ) \) with weights \( h_{i} \), and its excess kurtosis is:

$$ \gamma = 3\frac{{\sum\limits_{i = 1}^{M} {h_{i} \sigma_{i}^{4} } }}{{\left( {\sum\limits_{i = 1}^{M} {h_{i} \sigma_{i}^{2} } } \right)^{2} }} - 3 $$
(24)

As Eq. (24) shows, the excess kurtosis is nonnegative. It is zero only when all \( \sigma_{i} \) are equal, that is, when the parameters \( a^{'} ,b^{'} \) are proportional to \( a,b \):

$$ \frac{a}{{a^{'} }} = \frac{b}{{b^{'} }} $$
(25)

If Eq. (25) holds, the noise in \( f(y(p);a^{'} ,b^{'} ) \) is independent of the original pixel value \( x(p) \), that is, the noise in the transformed image \( f(y(p);a^{'} ,b^{'} ) \) is additive white noise. A vanishing excess kurtosis is therefore a necessary and sufficient condition for the noise to be additive white Gaussian noise.
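Eqs. (23) to (25) can be evaluated directly. The sketch below uses a hypothetical set of gray levels with a uniform histogram and hypothetical noise parameters; the function name `mixture_excess_kurtosis` is likewise an assumption. It illustrates that the excess kurtosis vanishes, up to rounding, when the assumed parameters are equal or proportional to the true ones, and is positive otherwise.

```python
import numpy as np

def mixture_excess_kurtosis(x, h, a, b, a_est, b_est, sigma=1.0):
    """Excess kurtosis of Eq. (24), using the per-level variances of Eq. (23)."""
    var_i = sigma ** 2 * (a * x + b) / (a_est * x + b_est)  # Eq. (23)
    return 3.0 * np.sum(h * var_i ** 2) / np.sum(h * var_i) ** 2 - 3.0

x = np.arange(1.0, 256.0)          # hypothetical gray levels
h = np.full(x.size, 1.0 / x.size)  # hypothetical uniform histogram
a, b = 0.5, 4.0                    # hypothetical true noise parameters

print(mixture_excess_kurtosis(x, h, a, b, 0.5, 4.0))   # matching parameters
print(mixture_excess_kurtosis(x, h, a, b, 1.0, 8.0))   # proportional parameters, Eq. (25)
print(mixture_excess_kurtosis(x, h, a, b, 0.1, 20.0))  # mismatched parameters
```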

Let \( G(X_{i} ) \) denote the sample excess kurtosis of the samples \( X_{1} , \ldots ,X_{N} \). When \( X_{1} , \ldots ,X_{N} \) are normally distributed, the sample excess kurtosis \( G(X_{i} ) \) multiplied by a suitable coefficient approximately follows the standard normal distribution:

$$ G(X_{i} )\sqrt {N/24} \sim N(0,1) $$
(26)

Eq. (26) implies that there is a nonnegative threshold \( T_{\gamma } \) such that, when \( G(X_{i} )\sqrt {N/24} \) is less than the threshold, the samples are accepted as normally distributed.
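The normality check based on Eq. (26) can be sketched as below. The threshold value 3.0 is an illustrative choice and is not taken from the paper; the function names and the two synthetic sample sets are also assumptions.

```python
import numpy as np

def sample_excess_kurtosis(samples):
    """Sample version of Eq. (21): fourth central moment divided by the
    squared second central moment, minus 3."""
    centered = samples - samples.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2 - 3.0

def passes_kurtosis_test(samples, threshold=3.0):
    """Compare G * sqrt(N / 24) from Eq. (26) with a threshold T_gamma.
    threshold=3.0 is an illustrative value, not one given in the paper."""
    statistic = sample_excess_kurtosis(samples) * np.sqrt(samples.size / 24.0)
    return statistic < threshold

rng = np.random.default_rng(4)
gaussian = rng.normal(size=50_000)
# Two noise levels mixed together, as in an image whose variance
# has not been stabilized (see Eq. (24)).
mixture = np.concatenate([rng.normal(scale=1.0, size=25_000),
                          rng.normal(scale=3.0, size=25_000)])

print(sample_excess_kurtosis(gaussian), passes_kurtosis_test(gaussian))
print(sample_excess_kurtosis(mixture), passes_kurtosis_test(mixture))
```

For the equal-weight mixture with variances 1 and 9, Eq. (24) gives an excess kurtosis of 3·41/25 − 3 = 1.92, so the mixture is expected to fail the test by a wide margin.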

Based on this normality check, the parameters of the VST are estimated by minimizing the excess kurtosis of the noise distribution. To make the minimization efficient, the parameters a and b are expressed in terms of the parameters \( \sigma \) and \( \phi \):

$$ \begin{aligned} a & = \sigma^{2} \cos \phi \\ b & = \sigma^{2} \sin \phi \\ \end{aligned} $$
(27)

where the parameter \( \sigma \) is nonnegative and the parameter \( \phi \) ranges over \( [0,\pi /2] \). Writing \( a^{'} ,b^{'} \) in the same way with parameters \( \sigma^{'} \) and \( \phi^{'} \), Eq. (25) becomes:

$$ \begin{aligned} & \frac{\cos \phi }{{\cos \phi^{'} }} = \frac{\sin \phi }{{\sin \phi^{'} }} \\ & \Leftrightarrow \sin (\phi - \phi^{'} ) = 0 \\ & \Leftrightarrow \phi = \phi^{'} \\ \end{aligned} $$
(28)

Substituting Eq. (27) into the VST of Eq. (13) gives:

$$ \begin{aligned} f(t;a,b) & = f(t;\sigma^{2} \cos \phi ,\sigma^{2} \sin \phi ) \\ & = \frac{2\sigma }{{\sigma^{2} \cos \phi }}\sqrt {t\sigma^{2} \cos \phi + \sigma^{2} \sin \phi } \\ & = \frac{2}{\cos \phi }\sqrt {t\cos \phi + \sin \phi } \\ \end{aligned} $$
(29)
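The one-parameter form of Eq. (29) lends itself to a simple search over \( \phi \). The sketch below is a simplified illustration, not the procedure of this paper: instead of measuring the excess kurtosis of PCA weights on a transformed image, it assumes the per-level noise variances are known exactly and evaluates Eq. (24) on a grid of candidate angles. All names and values are hypothetical.

```python
import numpy as np

def vst_phi(t, phi):
    """VST of Eq. (29), which depends on phi only."""
    return (2.0 / np.cos(phi)) * np.sqrt(t * np.cos(phi) + np.sin(phi))

def kurtosis_for_phi(phi, x, h, level_var):
    """Eq. (24) after stabilizing with angle phi, given per-level noise variances.
    To first order, the VST of Eq. (29) scales the variance at level x
    by 1 / (x cos(phi) + sin(phi))."""
    var_i = level_var / (x * np.cos(phi) + np.sin(phi))
    return 3.0 * np.sum(h * var_i ** 2) / np.sum(h * var_i) ** 2 - 3.0

def estimate_phi(x, h, level_var, num=2001):
    """Grid search over phi in (0, pi/2) for the smallest excess kurtosis."""
    grid = np.linspace(1e-3, np.pi / 2 - 1e-3, num)
    values = np.array([kurtosis_for_phi(p, x, h, level_var) for p in grid])
    return grid[np.argmin(values)]

a, b = 0.5, 4.0                    # hypothetical true noise parameters
x = np.arange(1.0, 256.0)          # hypothetical gray levels
h = np.full(x.size, 1.0 / x.size)  # hypothetical uniform histogram
level_var = a * x + b              # idealized per-level variances, Eq. (7)

phi_hat = estimate_phi(x, h, level_var)
print(phi_hat, np.arctan2(b, a))   # the two angles should nearly coincide
```

Searching over a single bounded angle avoids a two-dimensional search over a and b, which is the practical benefit of the reparameterization in Eq. (27).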

Eq. (29) shows that the VST depends only on the parameter \( \phi \). The estimates \( a^{'} \) and \( b^{'} \) of the noise model parameters are therefore obtained by iterating and comparing the computed excess kurtosis with the given threshold, and the noise variance \( \sigma_{i}^{2} \) \( (i = 0, \ldots ,255) \) of each gray value then follows from the noise model:

$$ \sigma_{i}^{2} = a^{'} i + b^{'} , \quad i = 0, \ldots ,255 $$
(30)

The average of these variances is taken as the noise variance estimate of the noisy image:

$$ \sigma_{avg}^{2} = \frac{1}{256}\sum\limits_{i = 0}^{255} {\sigma_{i}^{2} } $$
(31)
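Eqs. (30) and (31) amount to averaging a linear function over the 256 gray levels. A minimal sketch, with a hypothetical function name and hypothetical parameter values:

```python
import numpy as np

def average_noise_variance(a_est, b_est):
    """Eqs. (30)-(31): mean of a' * i + b' over gray levels i = 0, ..., 255."""
    levels = np.arange(256)
    var_i = a_est * levels + b_est  # Eq. (30): per-level noise variance
    return var_i.mean()             # Eq. (31): average noise variance

print(average_noise_variance(0.5, 4.0))   # Poisson-Gaussian case
print(average_noise_variance(0.0, 25.0))  # pure Gaussian case, a' = 0
```

Since the gray levels 0 to 255 have mean 127.5, Eq. (31) reduces to \( \sigma_{avg}^{2} = 127.5a^{'} + b^{'} \); for pure Gaussian noise (\( a^{'} = 0 \)) it returns \( b^{'} \) itself.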

Eq. (31) can therefore be used to estimate both Gaussian noise and mixed noise. The estimate is not affected by the texture of the video image and is more stable than PCA alone, which widens the range of video denoising applications.

4 Test Results

To verify the effectiveness of the proposed noise estimation algorithm, four test videos are selected: akiyo, foreman, salesman, and football [12]. Gaussian noise or Poisson-Gaussian noise is added to the video images, and the estimated noise level is compared with the true value and with the estimates of other noise estimation algorithms. The noise estimation error is defined as \( \delta (\sigma ) = \left| {\hat{\sigma } - \sigma } \right| \), the absolute difference between the estimated and the true noise level, and it is computed for each algorithm to measure the accuracy of the noise estimation. The 3rd frame of each test video is extracted for comparison. The noise-free 3rd frames of the four sequences are shown in Fig. 1:

Fig. 1. The 3rd frame of each original video

  (1) Table 1 compares the estimation errors after adding Gaussian noise at levels 10, 20, and 30:

    Table 1. Gaussian noise estimation error comparison

    Table 1 shows that, when only Gaussian noise is added, the noise variance estimated by the proposed algorithm differs very little from the true noise variance, and in most cases it is more accurate than the noise level estimated by the compared algorithms.

  (2) Table 2 compares the estimation errors after adding Poisson-Gaussian mixed noise with parameters \( a/b \):

    Table 2. Comparison of Gaussian-Poisson noise estimation errors

Table 2 shows that, when Gaussian-Poisson mixed noise is added, the average variance estimated by the proposed algorithm is very close to the average variance of the real noise. It is more accurate than the compared algorithms, and in particular more accurate than the PCA noise estimation algorithm. These comparisons indicate that the proposed algorithm estimates the noise level more accurately, both for Gaussian noise and for Gaussian-Poisson mixed noise, and therefore has good applicability.

5 Conclusions

This paper improves on traditional noise estimation algorithms by proposing a mixed noise estimation algorithm that combines PCA with a variance-stabilizing transformation. It also introduces the excess kurtosis to check the accuracy of the mixed-noise parameter estimates: the parameters of the VST are estimated by minimizing the excess kurtosis of the noise distribution. In addition, the noise estimation algorithm is combined with a classical video denoising algorithm to achieve a better video denoising effect and to broaden the range of video denoising applications.