An adaptive rank-sparsity K-SVD algorithm for image sequence denoising

doi:10.1016/j.patrec.2014.03.003

Pattern Recognition Letters

Volume 45, 1 August 2014, Pages 46-54

https://doi.org/10.1016/j.patrec.2014.03.003

Highlights

• We propose an algorithm for removing Gaussian noise from a given image sequence.
• We formulate it as an optimization problem on a propagated dictionary.
• The propagated dictionary is adaptively trained by a rank-sparsity representation.
• Restoration of signals is adaptively determined in terms of the noise level.

Abstract

In this paper, we propose an algorithm for the removal of additive white Gaussian noise (AWGN) from a given image sequence. By extending a frame in the spatial and temporal dimensions, the sequence is transformed into volumetric data in which each frame includes both the spatial and temporal correlation. Image sequence denoising is then formulated as an optimization problem that can be iteratively solved by constructing a rank-sparsity representation on a propagated dictionary. The proposed algorithm effectively trains this dictionary by adaptively determining the required number of iterations. Restoration of the volumetric data is adaptively determined in terms of the noise level. The results on some standard data sets show that the proposed algorithm outperforms the K-singular value decomposition (K-SVD) algorithm and the sparse K-SVD algorithm. If a sequence is characterized by global motion (the moving objects in a scene follow similar trajectories, i.e., they move as a unit) or high motion activity, the performance of the proposed algorithm is comparable to that of block-matching and 4-D filtering (BM4D) and video block-matching and 4-D filtering (V-BM4D).

Introduction

Denoising is a fundamental problem of image processing. In recent decades, many approaches have been investigated from diverse points of view [16], [13], [10], [7], [18], [1], [23], [27], [28]. Image sequence (video) denoising is the extended version of this problem, because an image sequence always encloses the inherent temporal correlation between frames (images). However, in practice, some approaches ignore the temporal correlation enclosed in an image sequence and process each frame separately [2], [4]. Other image sequence denoising methods explore the high temporal correlation in an image sequence to achieve better performance [6], [12], [29].

In general, approaches to image sequence denoising can be divided into two categories depending on how they use temporal correlation. The first category comprises motion compensated filters, which treat motion compensation and filtering as two independent problems [5]. Motion compensation is either explicitly applied during preprocessing or implicitly incorporated into the filtering [5]. After preprocessing, the temporal nonstationarity in an image sequence is removed. Then, the estimated trajectories can be applied for denoising either in the signal domain [14] or in a transformed domain [34], [17]. For these motion compensated filters, motion compensation is assumed to be helpful when dealing with the dynamic nature of the image sequence. Therefore, they are expected to outperform their non-motion compensated counterparts. However, this is not always true. Motion compensation may be unnecessary or even counterproductive for denoising, because it may produce inaccurate trajectories that lead to blur and information loss, especially in an image sequence that contains high levels of noise. Moreover, motion compensation is itself a difficult problem that incurs additional computational cost.

The other category of image sequence denoising methods is the spatio-temporal approaches that attempt to use the temporal correlation without motion compensation. Most of the spatio-temporal approaches are extended from classic 2-D filters [5], such as the techniques proposed by Buades et al. [6], [12], [29], [30]. These spatio-temporal filters tend to be less sensitive to nonstationarity in both space and time, because they take advantage of the correlation in both directions. This fact implies that it is crucial to make full use of both the spatial and temporal correlation to maximize performance. Spatio-temporal filters can also adapt their parameters for denoising. Because there is no one set of parameters that can fit all sequences, even at a fixed noise level [29], many approaches use adaptive statistical estimation [2], [4], adaptive selection of neighborhood size [3], or adaptive smoothing [19] to achieve better results.

One of the most successful non-motion compensated spatio-temporal filters is the approach reported by Protter and Elad [29]. This technique extends the work in [16], [15] with several modifications. The results reported by Protter and Elad [29] demonstrated that a propagated dictionary can help speed up the algorithm and lead to improved denoising performance. This is because the similarity between two adjacent frames can reduce the number of iterations (denoted by K) required to train the dictionary. A further conclusion is that K should not be constant, but should rather depend on the noise level [29]. However, no quantitative result is given.
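To make the role of K concrete, the following is a minimal sketch of the standard K-SVD dictionary update applied to a dictionary inherited from the previous frame. It is not the rank-sparsity variant proposed in this paper. The names `ksvd_atom_update`, `dictionary_sweeps`, and `n_sweeps` are hypothetical, and the sparse coding step that normally alternates with the update (for example, orthogonal matching pursuit) is omitted, so the coefficient supports are held fixed.

```python
import numpy as np


def ksvd_atom_update(Y, D, X, j):
    """Update atom j of D and its coefficients in X in place (standard K-SVD step)."""
    users = np.flatnonzero(X[j, :])  # signals whose code currently uses atom j
    if users.size == 0:
        return  # unused atom: left unchanged in this sketch
    # Residual of those signals with atom j's own contribution added back.
    E = Y[:, users] - D @ X[:, users] + np.outer(D[:, j], X[j, users])
    # The best rank-1 approximation of E gives the new atom and coefficients.
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, j] = U[:, 0]              # unit-norm atom
    X[j, users] = s[0] * Vt[0, :]  # the support of row j is not enlarged


def dictionary_sweeps(Y, D_init, X_init, n_sweeps):
    """Run n_sweeps passes over all atoms, starting from a propagated dictionary.

    D_init stands for the dictionary trained on the previous frame (assumption);
    n_sweeps plays the role of the iteration count K.
    """
    D = np.array(D_init, dtype=float)
    X = np.array(X_init, dtype=float)
    for _ in range(n_sweeps):
        for j in range(D.shape[1]):
            ksvd_atom_update(Y, D, X, j)
    return D, X
```

Because each atom update solves a rank-1 least-squares problem exactly, the representation error cannot increase from one sweep to the next. A good starting dictionary therefore needs fewer sweeps to reach a given error, which is the intuition behind propagating the dictionary between similar frames.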

The recent interest in denoising is related to low rank representation (LRR), which shows an excellent performance on many benchmark data sets [21], [20], [31]. LRR is able to automatically correct corrupted data [21], so it is mainly applied to reveal the actual segmentation of data (in the presence of noise or noise free) that are drawn from a union of multiple subspaces [21], [20], [31], [22]. Compared with sparse representation (SR), LRR is more robust to noise and outliers, because LRR is better at capturing the global structure of data [21], [20]. At the same time, the applications based on low-rank and sparse matrix decompositions have been reported for object detection [32], image classification [33], image inpainting [11], and dynamic magnetic resonance imaging restoration [26].

In this paper, we propose an algorithm similar to the K-singular value decomposition (K-SVD) algorithm, based on the foundational work by [16], [15], [29], to remove additive white Gaussian noise (AWGN) from image sequences. We propose three extensions to the original algorithm. The first is the rank-sparsity representation produced by solving an optimization problem, in which the representation is combined from LRR and SR. This is motivated by the conclusion that the exact solution to the problem of decomposing a matrix into the sum of a low-rank matrix and a sparse matrix can be found by minimizing the sum of the nuclear norm and the $l_{1}$ norm [8], [9]. Unlike some other methods [32], [8], [9], [29], the low-rank matrix and the sparse matrix are not separately used for different purposes. In fact, we propose that the sum of the low-rank matrix and the sparse matrix can be regarded as the hybrid representation matrix of signals on a specific dictionary, i.e., the identity matrix. Thus, signals are assumed to be linearly restored by the hybrid representation matrix on a redundant dictionary that is adaptively learned from the noisy signals. This is similar to the assumption by the authors of [16], [15], [29]. But we are interested in the rank-sparsity representation matrix of signals for training dictionaries.
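The decomposition result cited above from [8], [9] can be illustrated with a small sketch. The code below splits a matrix $M$ into a low-rank part $L$ and a sparse part $S$ by minimizing $\|L\|_{*} + \gamma \|S\|_{1}$ subject to $L + S = M$, using a generic augmented Lagrangian iteration. It is not the algorithm proposed in this paper; the function names, the weight `gamma`, the penalty `mu`, and their default values are assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(X, tau):
    """Elementwise shrinkage: the proximal operator of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def singular_value_threshold(X, tau):
    """Shrink singular values: the proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def rank_sparsity_decompose(M, gamma=None, mu=None, n_iter=500, tol=1e-7):
    """Return (L, S) with L low-rank, S sparse, and L + S approximately M."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    abs_sum = np.abs(M).sum()
    if abs_sum == 0.0:
        return np.zeros_like(M), np.zeros_like(M)
    if gamma is None:
        gamma = 1.0 / np.sqrt(max(m, n))  # assumed default weight
    if mu is None:
        mu = 0.25 * m * n / abs_sum       # assumed default penalty
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)                  # Lagrange multiplier
    norm_M = np.linalg.norm(M)
    for _ in range(n_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)
        S = soft_threshold(M - L + Y / mu, gamma / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

In the setting described above, the sum `L + S` would play the role of the hybrid representation matrix, rather than `L` and `S` being used separately.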

The second extension relates to the adaptivity of K. We describe a method that experimentally determines K in terms of the similarity between two adjacent frames in the transformed volumetric data. In contrast, K is a constant in [16], [15], [29].

The last extension relates to the adaptivity of $λ$, the parameter that balances the method of signal restoration. $λ$ is adaptively determined according to the noise level, unlike in [16], [15], [29], where it is constant.

The rest of this paper is organized as follows: some related work is presented in Section 2, and the proposed method is discussed in Section 3. Section 4 contains the experimental results obtained by the proposed algorithm. Finally, our conclusions are given in Section 5.

Section snippets

Related work

In this section, we will present some fundamental preliminaries. For clarity, denote a matrix $A = [a_{1} \dots a_{m}]$, where $a_{i}$ ($1 ⩽ i ⩽ m$) is the $i$th column vector of A. Denote a column vector $v = {[v_{1} \dots v_{n}]}^{T}$. For a given index set $I = \{i_{1}, \dots, i_{p}\}$, denote a sub-matrix of A and a sub-vector of v by $A_{I} = [a_{i_{1}} \dots a_{i_{p}}]$ and $v_{I} = {[v_{i_{1}} \dots v_{i_{p}}]}^{T}$, respectively, where $p ⩽ \min (m, n)$. Define $A_{I, J}$ as a sub-matrix of A, including the rows and columns indexed by I and J, respectively, where $J = \{j_{1}, \dots, j_{q}\}$, and $q ⩽ m$.
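As an illustration of this notation, the sketch below maps $A_{I}$, $v_{I}$, and $A_{I, J}$ onto NumPy indexing. The example values are arbitrary, and the index lists are 0-based as NumPy requires, whereas the text counts from 1.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)  # 3 rows, m = 4 columns a_1 ... a_4
v = np.array([10, 20, 30, 40])   # n = 4 entries

I = [0, 2]  # corresponds to I = {1, 3} in the 1-based notation of the text
J = [1, 3]  # corresponds to J = {2, 4}

A_I = A[:, I]            # sub-matrix made of the columns indexed by I
v_I = v[I]               # sub-vector made of the entries indexed by I
A_IJ = A[np.ix_(I, J)]   # rows indexed by I, columns indexed by J
```

`np.ix_` is needed for $A_{I, J}$ because plain `A[I, J]` would pair the indices elementwise and return a vector instead of a sub-matrix.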

The proposed method

In this section, we introduce the proposed method. We transform an image sequence into the volumetric data in which each frame includes both the spatial and temporal correlation, and then compute the similarity between two adjacent frames to adaptively determine K. We then formulate the problem of denoising as an optimization problem by the rank-sparsity representation on an adaptively learned dictionary. Finally, we propose a K-SVD-like algorithm to solve the optimization problem, and restore

Experiments

In this section, we compare the performance of the proposed algorithm with that of several other methods on some standard test data sets. For simplicity, unless stated otherwise, the parameters of the proposed algorithm in this section were set as: $n = 8 \times 8$, $M = 2000$, $k = 256$, $Δ t = 3$, $ρ = 1.15$. K and $λ$ were determined by Eqs. (8) and (14), respectively. The performance of each algorithm was evaluated in terms of the PSNR results and the visual quality of the restored frames. For fairness, the source codes (or
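The evaluation protocol described here, corrupting a clean sequence with AWGN and scoring the restored frames by PSNR, can be sketched as follows. This is a generic illustration rather than the authors' evaluation code. The function names, the fixed random seed, and the peak value of 255 (8-bit frames) are assumptions.

```python
import numpy as np


def add_awgn(frames, sigma, seed=0):
    """Return frames corrupted by zero-mean Gaussian noise of standard deviation sigma."""
    rng = np.random.default_rng(seed)
    clean = np.asarray(frames, dtype=np.float64)
    return clean + rng.normal(0.0, sigma, size=clean.shape)


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit intensities."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

A typical use would be `psnr(clean, denoise(add_awgn(clean, sigma)))`, where `denoise` is a placeholder for whichever algorithm is being compared. Averaging the per-frame values over a sequence is one common way to report a single figure.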

Conclusion

In this paper, we have proposed an adaptive spatio-temporal filter for image sequence denoising. The proposed method makes full use of both the spatial and temporal correlation in an image sequence by constructing the volumetric data. The similarity between two adjacent frames of the volumetric data is rather high, even in the presence of high noise. This similarity motivates the technique of training a propagated dictionary for each frame by using the rank-sparsity representation. Restoration

References (34)

T.J. Abrahamsen et al., Sparse non-linear denoising: Generalization performance and pattern reproducibility in functional MRI, Pattern Recogn. Lett. (2011)

Y. Liu et al., An efficient matrix factorization based low-rank representation for subspace clustering, Pattern Recogn. (2013)

C.T. Lu et al., Denoising of salt-and-pepper noise corrupted image using modified directional-weighted-median filter, Pattern Recogn. Lett. (2012)

A. Majumdar et al., Non-convex algorithm for sparse and low-rank recovery: Application to dynamic MRI reconstruction, Magnetic Reson. Imag. (2013)

G. Noel et al., Bilateral mesh filtering, Pattern Recogn. Lett. (2012)

J. Boulanger, C. Kervrann, P. Bouthemy, An adaptive statistical method for denoising 4d fluorescence image sequences...

J. Boulanger et al., Space-time adaptation for patch-based image sequence restoration, IEEE Trans. Pattern Anal. Mach. Intell. (2007)

J. Boulanger et al., Patch-based nonlocal functional for denoising fluorescence microscopy image sequences, IEEE Trans. Med. Imaging (2010)

J.C. Brailean et al., Noise reduction filters for dynamic image sequences: a review, Proc. IEEE (1995)

A. Buades, B. Coll, J.M. Morel, Denoising image sequences does not require motion estimation, in: Advanced Video and...

A. Buades et al., Image denoising methods: a new nonlocal principle, SIAM Rev. (2010)

E.J. Candès et al., Robust principal component analysis?, J. ACM (2011)

V. Chandrasekaran et al., Rank-sparsity incoherence for matrix decomposition, SIAM J. Optim. (2011)

P. Chatterjee et al., Clustering-based denoising with locally learned dictionaries, IEEE Trans. Image Process. (2009)

D.Q. Chen et al., Image inpainting based on low-rank and joint-sparse matrix recovery, Electron. Lett. (2013)

K. Dabov, A. Foi, K. Egiazarian, Video denoising by sparse 3D transform-domain collaborative filtering, in: Proc. 15th...

K. Dabov et al., Image denoising by sparse 3-d transform-domain collaborative filtering, IEEE Trans. Image Process. (2007)


^☆: This paper has been recommended for acceptance by C. Luengo.

^☆☆: This work was supported by National Basic Research Program of China (973 Program) under Grant No. 2011CB302201, and Specialized Research Fund for the Doctoral Program of Higher Education of China under Grants Nos. 20100181120030 and 20120181130007.
