Elsevier

Speech Communication

Volume 78, April 2016, Pages 73-83
Single-channel noise reduction via semi-orthogonal transformations and reduced-rank filtering

https://doi.org/10.1016/j.specom.2015.12.007

Highlights

  • A general framework is developed that combines semi-orthogonal transformation and reduced-rank filtering for noise reduction.

  • Under this new framework, several optimal reduced-rank filters are derived, including the maximum SNR, the Wiener, the tradeoff, and the MVDR filters.

  • Discussions are also provided on how to derive different semi-orthogonal transformations under four estimation criteria, including minimum correlation, minimum MSE, minimum distortion, and minimum residual noise.

  • Simulations are performed and the results show the properties of the deduced optimal reduced-rank filters.

Abstract

This paper investigates the problem of single-channel noise reduction in the time domain. The objective is to find a lower dimensional filter that can yield a noise reduction performance as close as possible to or even better than that obtained by the full-rank solution. This is achieved in three steps. First, we transform the observation signal vector sequence, through a semi-orthogonal matrix, into a sequence of transformed signal vectors with a reduced dimension. Second, a reduced-rank filter is applied to get an estimate of the clean speech in the transformed domain. Third, the estimate of the clean speech in the time domain is obtained by an inverse semi-orthogonal transformation. The focus of this paper is on the derivation of semi-orthogonal transformations under certain estimation criteria in the first step and the design of the reduced-rank optimal filters that can be used in the second step. We show how noise reduction using the principle of rank reduction can be cast as an optimal filtering problem, and how different semi-orthogonal transformations affect the noise reduction performance. Simulations are performed under various conditions to validate the deduced filters for noise reduction.

Introduction

The problem of single-channel noise reduction is to recover a clean speech signal of interest from its microphone observations (Benesty and Chen, 2011; Benesty et al., 2009; Loizou, 2007). Owing to its importance and broad range of applications, a great deal of effort has been devoted to this problem over the last decades, and many algorithms have been developed, e.g., Wiener (1949), Boll (1979), Berouti et al. (1979), Lim and Oppenheim (1979), Ephraim and Malah (1984), Trees and Harry (2001). However, these algorithms generally achieve noise reduction at the price of added speech distortion. One exception is the reduced-rank or subspace method, which has the potential to introduce less distortion if the desired signal correlation matrix is rank deficient and this rank is correctly estimated. This paper is, therefore, devoted to reduced-rank filtering methods.

The idea of “reduced rank” was first developed in the field of signal estimation (Huffel, 1993; Moor, 1993; Scharf, 1991; Scharf and Tufts, 1987; Tufts and Kumaresan, 1982; Tufts and Kumaresan, 1982). It was then applied to the noise reduction problem in the so-called subspace approach (Dendrinos et al., 1991), where the singular value decomposition (SVD) of the noisy data matrix was used to estimate and remove the noise subspace, and the estimate of the clean signal was then obtained from the remaining subspace. This approach gained popularity when Ephraim and Van Trees proposed to decompose the covariance matrix of the noisy observation vector (Ephraim and Van Trees, 1995). The subspace method was found to be better than the widely used spectral subtraction (Boll, 1979) for noise reduction in the sense that it introduces less speech distortion with little musical residual noise. Today, the principle has been studied to deal with not only white noise (Ephraim and Van Trees, 1995) but also colored noise (Hu and Loizou, 2003; Huang and Zhao, 1998; Huang and Zhao, 2000; Mittal and Phamdo, 2000; Rezayee and Gazor, 2001). Besides the SVD (Moor, 1993; Scharf, 1991; Scharf and Tufts, 1987; Tufts and Kumaresan, 1982; Tufts and Kumaresan, 1982) and the eigenvalue decomposition (EVD) (Ephraim and Van Trees, 1995; Hu and Loizou, 2003), truncated (Q)SVD (Hansen and Jensen, 1998; Jensen et al., 1995) and triangular decompositions (Hansen and Jensen, 2007) were also investigated in the subspace approach. More recent work on reduced-rank filtering can be found in Hansen and Jensen (2013), Nørholm et al. (2014), and Zhang et al. (2014).

This paper is also concerned with the application of the reduced-rank principle to noise reduction. But unlike most existing work (e.g., Dendrinos et al., 1991; Ephraim and Van Trees, 1995; Goldstein et al., 1999; Goldstein et al., 1998; Scharf, 1991; Scharf and Tufts, 1987), which exploits the structure of either the signal data matrix or the covariance matrix to find the signal and noise subspaces, this paper develops a more flexible framework. We choose a semi-orthogonal matrix to transform the data instead of directly decomposing the subspaces. The semi-orthogonal matrix is not unique, and it can be derived under different criteria. The resulting semi-orthogonal matrices represent the characteristics of both the signal and the noise, and thus might be used in various conditions. Another contribution of the paper is the derivation of the optimal filters under the reduced-rank framework.

In this framework, noise reduction is achieved in three steps. We first prefilter the full-length observed vector with a semi-orthogonal matrix, resulting in a reduced-dimension vector. In other words, we apply a linear transformation that maps the observed data vector to a new coordinate system whose basis vectors are the columns of the semi-orthogonal matrix. This is workable because the dimension of the signal subspace is smaller than that of the observed noisy signal space. The second step is to design an optimal reduced-rank filter and apply this filter to get an estimate of the clean speech in the transformed domain. Note that the optimal filter is matrix-valued and the noisy signal is processed on a vector-by-vector basis. The estimate of the clean speech in the time domain is finally obtained by an inverse semi-orthogonal transformation. We will discuss how to derive different semi-orthogonal transformations under certain estimation criteria and how to design different reduced-rank optimal filters. We will also illustrate the flexibility of this new framework in controlling the compromise between noise reduction and speech distortion.

The rest of the paper is organized as follows. In Section 2, the signal model and problem formulation are presented. Section 3 gives the definition of the semi-orthogonal transformation. Then in Section 4, the principle of linear filtering with a rectangular matrix is discussed. Section 5 presents some performance measures for evaluation and analysis of noise reduction. In Section 6, different optimal filters are derived under a given semi-orthogonal transformation. Different semi-orthogonal transformations are discussed in Section 7. Some simulations are presented in Section 8. Finally, conclusions are drawn in Section 9.

Section snippets

Signal model and problem formulation

The noise reduction problem considered in this paper is one of recovering the desired zero-mean speech signal x(k), k being the discrete-time index, from the noisy observation (sensor signal) (Benesty and Chen, 2011; Benesty et al., 2009): y(k) = x(k) + v(k), where v(k), assumed to be a zero-mean random process, is the unwanted additive noise that can be either white or colored but is uncorrelated with x(k). All signals are considered to be real and broadband.
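The additive model can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the speech is replaced by a synthetic zero-mean Gaussian sequence, the noise is white with a hypothetical standard deviation of 0.5, and the ratio of variances is used as an input SNR. Because the two sequences are zero mean and uncorrelated, the variance of the observation is approximately the sum of the two variances.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000  # number of samples (hypothetical)

# Synthetic stand-ins: zero-mean "speech" and independent zero-mean white noise
x = rng.standard_normal(N)
v = 0.5 * rng.standard_normal(N)

# Additive observation model y(k) = x(k) + v(k)
y = x + v

# Sample variances; the cross term vanishes (up to estimation error)
# because x and v are uncorrelated
var_x, var_v, var_y = x.var(), v.var(), y.var()

# Ratio of signal variance to noise variance, in dB
input_snr_db = 10.0 * np.log10(var_x / var_v)
```

With these assumed settings the theoretical variance ratio is 1 / 0.25, i.e., about 6 dB.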

The signal model given in

Semi-orthogonal transformation

We recall that x(k) is the desired signal vector that we want to estimate from the observation signal vector, y(k).

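Let $\mathbf{T} = [\mathbf{t}_0 \;\; \mathbf{t}_1 \;\; \cdots \;\; \mathbf{t}_{P-1}]$ be a semi-orthogonal matrix of size $L \times P$, i.e., $\mathbf{T}^T \mathbf{T} = \mathbf{I}_P$, where $\mathbf{I}_P$ is the $P \times P$ identity matrix and $P \le L$. We define the transformed desired signal vector of length $P$ as $\mathbf{x}'(k) = \mathbf{T}^T \mathbf{x}(k) = [x'_0(k) \;\; x'_1(k) \;\; \cdots \;\; x'_{P-1}(k)]^T$, where $x'_p(k) = \mathbf{t}_p^T \mathbf{x}(k)$. Now, instead of estimating the vector $\mathbf{x}(k)$ of length $L$, we will estimate the shorter vector $\mathbf{x}'(k)$ of length $P$.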
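The definition above can be checked numerically with a small sketch. The dimensions L = 8 and P = 3 are hypothetical, and the matrix T is obtained here by orthonormalizing a random matrix with a reduced QR factorization, which is only one convenient way to get orthonormal columns and is not one of the transformations derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
L, P = 8, 3  # hypothetical full and reduced dimensions, with P <= L

# Reduced QR of a random L x P matrix: Q has P orthonormal columns
T, _ = np.linalg.qr(rng.standard_normal((L, P)))

# Stand-in for the desired signal vector x(k) of length L
x = rng.standard_normal(L)

# Transformed desired signal vector x'(k) = T^T x(k), of length P
x_prime = T.T @ x

# Semi-orthogonality: T^T T is the P x P identity
gram = T.T @ T

# T T^T is an L x L orthogonal projector, not the identity when P < L
proj = T @ T.T
```

Note that only the product T^T T equals the identity. The product T T^T projects onto the column space of T, so mapping x'(k) back with T recovers x(k) exactly only when x(k) lies in that P-dimensional subspace.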

In (6), x′(k) is expressed

Linear filtering with a rectangular matrix

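In the general linear filtering approach, we estimate the transformed desired signal vector, $\mathbf{x}'(k)$, by applying a linear transformation to $\mathbf{y}(k)$, i.e., $\mathbf{z}'(k) = \mathbf{H}' \mathbf{y}(k) = \mathbf{H}' [\mathbf{x}(k) + \mathbf{v}(k)] = \mathbf{x}'_{\mathrm{fd}}(k) + \mathbf{v}'_{\mathrm{rn}}(k)$,

where $\mathbf{z}'(k)$ is the estimate of $\mathbf{x}'(k)$, $\mathbf{H}'$ is a rectangular filtering matrix of size $P \times L$, $\mathbf{x}'_{\mathrm{fd}}(k) = \mathbf{H}' \mathbf{x}(k)$ is the filtered desired signal, and $\mathbf{v}'_{\mathrm{rn}}(k) = \mathbf{H}' \mathbf{v}(k)$ is the residual noise. Therefore, the estimate of $\mathbf{x}(k)$ is $\mathbf{z}(k) = \mathbf{T} \mathbf{z}'(k) = \mathbf{T} \mathbf{H}' \mathbf{y}(k) = \mathbf{H} \mathbf{y}(k)$, where $\mathbf{H} = \mathbf{T} \mathbf{H}'$ is a filtering matrix of size $L \times L$.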
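The filtering chain described above can be traced in a short sketch. The rectangular matrix H' below is filled with arbitrary random values purely to exercise the algebra; it is not any of the optimal filters of Section 6, and the dimensions, the random T, and the noise level are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
L, P = 8, 3  # hypothetical dimensions, P <= L

# Semi-orthogonal T (orthonormalized random matrix, for illustration only)
T, _ = np.linalg.qr(rng.standard_normal((L, P)))

# Arbitrary P x L rectangular filtering matrix (NOT an optimal filter)
H_prime = rng.standard_normal((P, L))

# Observation vector y(k) = x(k) + v(k)
x = rng.standard_normal(L)
v = 0.1 * rng.standard_normal(L)
y = x + v

# Estimate in the transformed domain and its two components
z_prime = H_prime @ y
x_fd = H_prime @ x   # filtered desired signal
v_rn = H_prime @ v   # residual noise

# Back to the time domain: z(k) = T z'(k) = H y(k), with H = T H'
z = T @ z_prime
H = T @ H_prime
```

Although H is of size L × L, it is the product of an L × P and a P × L matrix, so its rank cannot exceed P. This is where the reduced-rank character of the overall filter comes from.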

We find that the

Performance measures

In this section, we define two categories of performance measures. The first category evaluates the noise reduction performance while the second one evaluates the signal distortion. We also discuss the very convenient mean-squared error (MSE) criterion and show how it is related to the performance measures.

Optimal rectangular filtering matrices

In this section, we briefly discuss the most important optimal rectangular filtering matrices for noise reduction, which explicitly depend on the semi-orthogonal matrix T.

Examples of semi-orthogonal matrices

In Section 3, we gave an example for the choice of T. In this section, we show some other important possibilities depending on what we want to achieve.

Simulations and experiments

In this section, we study, using simulations and experiments, the impact of some important parameters on the noise reduction performance of the optimal filters derived in Section 6.

Conclusion

This paper investigated the application of the reduced-rank principle to the problem of single-channel noise reduction in the framework of semi-orthogonal transformations. Under this framework, we derived the maximum SNR, the Wiener, the MVDR, and the tradeoff filters. Simulation results showed that these reduced-rank optimal filters can yield a better output SNR and a similar PESQ performance compared to their full-rank counterparts if the rank parameter is properly chosen. Furthermore, these

References (33)

  • Dendrinos, M. et al. Speech enhancement from noise: A regenerative approach. Speech Commun. (1991)
  • Huang, J. et al. An energy-constrained signal subspace method for speech enhancement and recognition in white and colored noises. Speech Commun. (1998)
  • Scharf, L.L. The SVD and reduced rank signal processing. Signal Process. (1991)
  • Benesty, J. et al. Optimal Time-domain Noise Reduction Filters: A Theoretical Study. (2011)
  • Benesty, J. et al. Noise Reduction in Speech Processing. (2009)
  • Berouti, M. et al. Enhancement of speech corrupted by acoustic noise estimation using linear prediction. Proceedings of the IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP). (1979)
  • Boll, S.F. Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust., Speech, Signal Process. (1979)
  • Cohen, I. Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging. IEEE Trans. Signal Process. (2003)
  • Ephraim, Y. et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator. IEEE Trans. Acoust., Speech, Signal Process. (1984)
  • Ephraim, Y. et al. A signal subspace approach for speech enhancement. IEEE Trans. Speech, Audio Process. (1995)
  • Goldstein, J.S. et al. Reduced-rank adaptive filtering. IEEE Trans. Signal Process. (1997)
  • Goldstein, J.S. et al. A multistage matrix Wiener filter for subspace detection. Proceedings of the IT Workshop on Detection, Estimation and Classification and Imaging. (1999)
  • Goldstein, J.S. et al. A multistage representation of the Wiener filter based on orthogonal projections. IEEE Trans. Inf. Theory. (1998)
  • Hansen, P.C. et al. FIR filter representations of reduced-rank noise reduction. IEEE Trans. Signal Process. (1998)
  • Hansen, P.C. et al. Subspace-based noise reduction for speech signals via diagonal and triangular matrix decompositions: Survey and analysis. EURASIP J. Adv. Signal Process. (2007)
  • Hansen, P.C. et al. A class of optimal rectangular filtering matrices for single-channel signal enhancement in the time domain. IEEE Trans. Audio, Speech Lang. Process. (2013)

This work was supported in part by the NSFC under Grant No. 61425005.