Single-channel noise reduction via semi-orthogonal transformations and reduced-rank filtering
Introduction
The problem of single-channel noise reduction is to recover a clean speech signal of interest from its microphone observations (Benesty, Chen, 2011, Benesty, Chen, Huang, Cohen, 2009, Loizou, 2007). Because of its importance and broad range of applications, a great deal of effort has been devoted to this problem over the last decades, and many algorithms have been developed, e.g., Wiener (1949), Boll (1979), Berouti et al. (1979), Lim and Oppenheim (1979), Ephraim and Malah (1984), Trees and Harry (2001). However, these algorithms generally achieve noise reduction at the price of added speech distortion. One exception is the reduced-rank or subspace method, which has the potential to introduce less distortion if the desired signal correlation matrix is rank deficient and its rank is correctly estimated. This paper is, therefore, devoted to reduced-rank filtering methods.
The idea of “reduced rank” was first developed in the field of signal estimation (Huffel, 1993, Moor, 1993, Scharf, 1991, Scharf, Tufts, 1987, Tufts, Kumaresan, 1982, Tufts, Kumaresan, 1982). It was then applied to the noise reduction problem in the so-called subspace approach (Dendrinos et al., 1991), where the singular value decomposition (SVD) of the noisy data matrix was used to estimate and remove the noise subspace, and the estimate of the clean signal was then obtained from the remaining subspace. This approach gained popularity when Ephraim and Van Trees proposed to decompose the covariance matrix of the noisy observation vector (Ephraim and Trees, 1995). The subspace method was found to be better than the widely used spectral subtraction (Boll, 1979) for noise reduction in the sense that it introduces less speech distortion and little musical residual noise. Today, the principle has been studied to deal with not only white (Ephraim and Trees, 1995) but also colored noise (Hu, Loizou, 2003, Huang, Zhao, 1998, Huang, Zhao, 2000, Mittal, Phamdo, 2000, Rezayee, Gazor, 2001). Besides the SVD (Moor, 1993, Scharf, 1991, Scharf, Tufts, 1987, Tufts, Kumaresan, 1982, Tufts, Kumaresan, 1982) and the eigenvalue decomposition (EVD) (Ephraim, Trees, 1995, Hu, Loizou, 2003), truncated (Q)SVD (Hansen, Jensen, 1998, Jensen, Hansen, Hansen, Sorensen, 1995) and triangular decompositions (Hansen and Jensen, 2007) were also investigated in the subspace approach. More recent work on reduced-rank filtering can be found in Hansen and Jensen (2013), Nørholm et al. (2014), Zhang et al. (2014).
This paper is also concerned with the application of the reduced-rank principle to noise reduction. But unlike most existing work (e.g., Dendrinos, Bakamidis, Carayannis, 1991, Ephraim, Trees, 1995, Goldstein, Reed, Dudgeon, Guerci, 1999, Goldstein, Reed, Scharf, 1998, Scharf, 1991, Scharf, Tufts, 1987), which exploits the structure of either the signal data matrix or the covariance matrix to find the signal and noise subspaces, this paper develops a more flexible framework. We choose a semi-orthogonal matrix to transform the data instead of directly decomposing the subspaces. The semi-orthogonal matrix is not unique, and it can be derived under different criteria. The resulting semi-orthogonal matrices represent the characteristics of both the signal and the noise, and thus might be used in various conditions. Another contribution of the paper is the derivation of the optimal filters under the reduced-rank framework.
In this framework, noise reduction is achieved in three steps. We first prefilter the full-length observed vector by a semi-orthogonal matrix, resulting in a reduced-dimension vector. In other words, we apply a linear transformation that maps the observed data vector to a new coordinate system whose basis vectors are the columns of the semi-orthogonal matrix. This is workable because the dimension of the signal subspace is smaller than that of the observed noisy signal space. The second step is to design an optimal reduced-rank filter and apply this filter to get an estimate of the clean speech in the transformed domain. Note that the optimal filter is matrix-valued and the noisy signal is processed on a vector-by-vector basis. The estimate of the clean speech in the time domain is finally obtained by an inverse semi-orthogonal transformation. We will discuss how to derive different semi-orthogonal transformations under certain estimation criteria and how to design different reduced-rank optimal filters. We will also illustrate the flexibility of this new framework in controlling the compromise between noise reduction and speech distortion.
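The three steps can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the correlation matrices are toy values assumed known, T is taken as the P principal eigenvectors of the speech correlation matrix (one possible choice among many), and the P × P matrix G (a hypothetical name) is a Wiener-type filter designed in the transformed domain.

```python
import numpy as np

def reduced_rank_denoise(y, T, G):
    """Three-step sketch: transform, filter in the reduced domain, transform back.

    y : (L,) noisy observation vector
    T : (L, P) semi-orthogonal matrix, T.T @ T == I_P
    G : (P, P) filter applied in the transformed domain (hypothetical name)
    """
    y_red = T.T @ y      # step 1: length-P transformed observation
    z_red = G @ y_red    # step 2: estimate of the transformed clean vector
    return T @ z_red     # step 3: back to a length-L time-domain estimate


rng = np.random.default_rng(0)
L, P = 8, 3

# Toy rank-P speech correlation matrix and white noise of variance 0.1 (assumed known).
A = rng.standard_normal((L, P))
R_x = A @ A.T
R_v = 0.1 * np.eye(L)

# Assumed choice of T: the P principal eigenvectors of R_x (eigh sorts ascending).
eigvals, eigvecs = np.linalg.eigh(R_x)
T = eigvecs[:, ::-1][:, :P]

# Wiener-type filter designed in the transformed domain (an assumption of this sketch).
R_x_red = T.T @ R_x @ T
R_y_red = T.T @ (R_x + R_v) @ T
G = R_x_red @ np.linalg.inv(R_y_red)

x = A @ rng.standard_normal(P)                   # clean vector in the signal subspace
y = x + np.sqrt(0.1) * rng.standard_normal(L)    # noisy observation
z = reduced_rank_denoise(y, T, G)                # time-domain estimate of x
```

Because the last step multiplies by T, the estimate always lies in the P-dimensional subspace spanned by the columns of T, which is where the rank reduction comes from.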
The rest of the paper is organized as follows. In Section 2, the signal model and problem formulation are presented. Section 3 gives the definition of the semi-orthogonal transformation. Then in Section 4, the principle of linear filtering with a rectangular matrix is discussed. Section 5 presents some performance measures for evaluation and analysis of noise reduction. In Section 6, different optimal filters are derived under a given semi-orthogonal transformation. Different semi-orthogonal transformations are discussed in Section 7. Some simulations are presented in Section 8. Finally, conclusions are drawn in Section 9.
Section snippets
Signal model and problem formulation
The noise reduction problem considered in this paper is one of recovering the desired zero-mean speech signal x(k), k being the discrete-time index, from the noisy observation (sensor signal) (Benesty, Chen, 2011, Benesty, Chen, Huang, Cohen, 2009): y(k) = x(k) + v(k), where v(k), assumed to be a zero-mean random process, is the unwanted additive noise that can be either white or colored but is uncorrelated with x(k). All signals are considered to be real and broadband.
The signal model given in
Semi-orthogonal transformation
We recall that x(k) is the desired signal vector that we want to estimate from the observation signal vector, y(k).
Let T be a semi-orthogonal matrix of size L × P, i.e., T^T T = I_P, where I_P is the P × P identity matrix and P ≤ L. We define the transformed desired signal vector of length P as x′(k) = T^T x(k). Now, instead of estimating the vector x(k) of length L, we will estimate the shorter vector x′(k) of length P.
In (6), x′(k) is expressed
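A small numerical check can make the semi-orthogonality property concrete. The sketch below assumes the transformed vector is obtained as T^T x and builds an arbitrary T from a reduced QR factorization of a random matrix; this construction is chosen only for illustration and is not one of the paper's transformations.

```python
import numpy as np

rng = np.random.default_rng(1)
L, P = 6, 2

# Reduced QR of an L x P random matrix gives an L x P factor with orthonormal columns.
T, _ = np.linalg.qr(rng.standard_normal((L, P)))

x = rng.standard_normal(L)   # stand-in for the length-L desired signal vector
x_red = T.T @ x              # length-P transformed vector

gram = T.T @ T               # equals the P x P identity
proj = T @ T.T               # L x L orthogonal projector, not the identity when P < L
```

Here `gram` is the identity while `proj` is only a projector onto the column space of T, so going to the reduced domain and back recovers x exactly only when x already lies in that column space.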
Linear filtering with a rectangular matrix
In the general linear filtering approach, we estimate the transformed desired signal vector, x′(k), by applying a linear transformation to y(k), i.e.,
z′(k) = H′y(k) = H′x(k) + H′v(k),
where z′(k) is the estimate of x′(k), H′ is a rectangular filtering matrix of size P × L, H′x(k) is the filtered desired signal, and H′v(k) is the residual noise. Therefore, the estimate of x(k) is z(k) = Tz′(k) = TH′y(k) = Hy(k), where H = TH′ is a filtering matrix of size L × L.
We find that the
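The rectangular filtering step can be checked numerically. In this sketch the filter H′ is an arbitrary random P × L matrix (not an optimal one), T is an arbitrary semi-orthogonal matrix, and the equivalent L × L matrix is assumed to be the product of T and H′; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
L, P = 6, 2

T, _ = np.linalg.qr(rng.standard_normal((L, P)))   # L x P with T.T @ T = I_P
H_red = rng.standard_normal((P, L))                # arbitrary P x L filter (H' in the text)

x = rng.standard_normal(L)   # desired signal vector
v = rng.standard_normal(L)   # additive noise vector
y = x + v                    # noisy observation

z_red = H_red @ y            # estimate in the transformed domain
x_fd = H_red @ x             # filtered desired signal
v_rn = H_red @ v             # residual noise

H = T @ H_red                # equivalent L x L filtering matrix
z = H @ y                    # time-domain estimate
```

Although H is L × L, it is a product of an L × P and a P × L matrix, so its rank cannot exceed P.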
Performance measures
In this section, we define two categories of performance measures. The first category evaluates the noise reduction performance while the second one evaluates the signal distortion. We also discuss the very convenient mean-squared error (MSE) criterion and show how it is related to the performance measures.
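The sketch below illustrates measures of both categories and their link to the MSE. The definitions are the trace-based forms commonly used in the time-domain noise reduction literature and are an assumption here, since the paper's exact expressions are not reproduced in this excerpt; H denotes the equivalent L × L filtering matrix and all function names are hypothetical.

```python
import numpy as np

def input_snr(R_x, R_v):
    # Ratio of desired-signal power to noise power before filtering.
    return np.trace(R_x) / np.trace(R_v)

def output_snr(H, R_x, R_v):
    # Ratio of filtered desired-signal power to residual-noise power.
    return np.trace(H @ R_x @ H.T) / np.trace(H @ R_v @ H.T)

def noise_reduction_factor(H, R_v):
    # Noise power before filtering divided by residual-noise power.
    return np.trace(R_v) / np.trace(H @ R_v @ H.T)

def speech_distortion_index(H, R_x):
    # Normalized power of the distortion (H - I) x of the desired signal.
    E = H - np.eye(H.shape[0])
    return np.trace(E @ R_x @ E.T) / np.trace(R_x)

def mse(H, R_x, R_v):
    # MSE between H y and x for uncorrelated x and v: distortion plus residual noise.
    E = H - np.eye(H.shape[0])
    return np.trace(E @ R_x @ E.T) + np.trace(H @ R_v @ H.T)

# Toy example: a plain gain of 0.5 on every sample.
L = 4
R_x = np.diag([4.0, 2.0, 1.0, 1.0])
R_v = 0.5 * np.eye(L)
H = 0.5 * np.eye(L)
```

With these definitions the MSE splits as tr(R_x) times the speech distortion index plus tr(R_v) divided by the noise reduction factor. The toy gain shows the tradeoff: it reduces noise and distorts speech without changing the SNR.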
Optimal rectangular filtering matrices
In this section, we briefly discuss the most important optimal rectangular filtering matrices for noise reduction, which explicitly depend on the semi-orthogonal matrix T.
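As an illustration of how such a filter depends on T, the sketch below computes a Wiener-type rectangular matrix. It assumes the criterion is the MSE between H′y and the transformed vector T^T x, with x and v uncorrelated, for which the minimizer is T^T R_x R_y^{-1}; the paper's own expressions may be written differently, and the function names are hypothetical.

```python
import numpy as np

def wiener_rectangular(T, R_x, R_v):
    # Minimizer of E||H' y - T^T x||^2: H' = T^T R_x R_y^{-1}, with R_y = R_x + R_v.
    R_y = R_x + R_v
    # solve() returns R_y^{-1} R_x T; its transpose is T^T R_x R_y^{-1} (symmetric matrices).
    return np.linalg.solve(R_y, R_x @ T).T

def mse_transformed(H_red, T, R_x, R_v):
    # E||H' y - T^T x||^2 expanded with E[y x^T] = R_x.
    R_y = R_x + R_v
    return (np.trace(T.T @ R_x @ T)
            - 2.0 * np.trace(H_red @ R_x @ T)
            + np.trace(H_red @ R_y @ H_red.T))

rng = np.random.default_rng(3)
L, P = 6, 2

B = rng.standard_normal((L, L))
R_x = B @ B.T                                      # toy speech correlation matrix
R_v = 0.5 * np.eye(L)                              # toy white-noise correlation matrix
T, _ = np.linalg.qr(rng.standard_normal((L, P)))   # arbitrary semi-orthogonal matrix

H_w = wiener_rectangular(T, R_x, R_v)
J_w = mse_transformed(H_w, T, R_x, R_v)
J_perturbed = mse_transformed(H_w + 0.1 * rng.standard_normal((P, L)), T, R_x, R_v)
```

Any perturbation of `H_w` increases the criterion, and the orthogonality condition H′R_y = T^T R_x holds at the minimizer.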
Examples of semi-orthogonal matrices
In Section 3, we gave an example for the choice of T. In this section, we show some other important possibilities depending on what we want to achieve.
Simulations and experiments
In this section, we study, using simulations and experiments, the impact of some important parameters on the noise reduction performance of the optimal filters derived in Section 6.
Conclusion
This paper investigated the application of the reduced-rank principle to the problem of single-channel noise reduction in the framework of semi-orthogonal transformations. Under this framework, we derived the maximum SNR, the Wiener, the MVDR, and the tradeoff filters. Simulation results showed that these reduced-rank optimal filters can yield a better output SNR than their full-rank counterparts, with similar PESQ performance, if the rank parameter is properly chosen. Furthermore, these
References (33)
- et al. Speech enhancement from noise: A regenerative approach. Speech Commun. (1991)
- et al. An energy-constrained signal subspace method for speech enhancement and recognition in white and colored noises. Speech Commun. (1998)
- The SVD and reduced rank signal processing. Signal Process. (1991)
- et al. Optimal Time-domain Noise Reduction Filters: A Theoretical Study. (2011)
- et al. Noise Reduction in Speech Processing. (2009)
- et al. Enhancement of speech corrupted by acoustic noise estimation using linear prediction. Proceedings of the IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP). (1979)
- Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust., Speech, Signal Process. (1979)
- Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging. IEEE Trans. Signal Process. (2003)
- et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator. IEEE Trans. Acoust., Speech, Signal Process. (1984)
- et al. A signal subspace approach for speech enhancement. IEEE Trans. Speech, Audio Process. (1995)