An efficient stabilized fast Newton adaptive filtering algorithm for stereophonic acoustic echo cancellation SAEC

https://doi.org/10.1016/j.compeleceng.2012.02.010

Abstract

This paper addresses stereophonic acoustic echo cancellation (SAEC) by adaptive filtering algorithms. We recently proposed a new version of the fast Newton transversal filter (FNTF) algorithm for SAEC applications. In this paper, we propose an efficient modification of this algorithm for the same applications. The new algorithm uses a new, simplified numerical stabilization technique and takes into account the cross-correlation between the channel inputs. The basic idea is to introduce a small nonlinearity into each channel, which reduces the inter-channel coherence while remaining unnoticeable in speech because of self-masking. The proposed modification does not increase the complexity of the original version, which remains less than half that of the fastest two-channel fast transversal filter (FTF) version. Simulation results and comparisons with the extended two-channel normalized least mean square (NLMS) and FTF algorithms are presented.

Graphical abstract

(a) Convergence performance of the E2CLMS, 2CFTF, 2CFNTF (M2CFNTF with β = 0) and proposed M2CFNTF algorithms with β = 0.5, α1 = 0.75 and α2 = 0.85. Input: USASI noise; L1 = L2 = 256; N1 = N2 = 20; output SNR = 60 dB. There is a jump at block 150 (each block contains 256 samples). (b) Temporal evolution of the MSE for the E2CLMS, 2CFTF, 2CFNTF (M2CFNTF with β = 0) and proposed M2CFNTF algorithms with β = 0.5, α1 = 0.75 and α2 = 0.85. Input: speech signal of Fig. 3; L1 = L2 = 1024, μ1 = μ2 = 0.5, λ = 0.99985, μv1 = μv2 = 0.85; output SNR = 60 dB.

Highlights

► We propose an efficient and stabilized new version of the two-channel FNTF algorithm.
► The proposed two-channel FNTF algorithm uses a new modified technique that effectively decorrelates the stereophonic input signals.
► An efficient and simple numerical stabilization technique is proposed and applied to the proposed two-channel FNTF algorithm.
► The convergence speed of the proposed FNTF algorithm in SAEC applications is significantly improved.

Introduction

Acoustic echo cancellers are indispensable in communication systems such as teleconferencing, where echoes impair the quality of communication. In theory, stereophonic acoustic echo cancellation (SAEC) can be viewed as a simple generalization of the usual single-channel acoustic echo cancellation principle to the two-channel case [1], [2], [3]. SAEC aims at far better sound quality and sound localization than earlier systems provided. Quality can be improved by increasing the signal bandwidth and by adding more audio channels to the system; the latter spurred the need for multi-channel acoustic echo cancellers. The two-channel case is the most interesting, since the more general multi-channel case differs only in complexity.

A basic scheme for SAEC is sketched in Fig. 1, with a transmission room on the left and a receiving room on the right. The transmission room is sometimes referred to as the far-end and the receiving room as the near-end. When a source (a male or female speaker) emits a signal in the transmission room, each of the two microphones mic1 and mic2 receives two components. The first is the direct source signal modified by the path Gv1 (captured by mic1) or Gv2 (captured by mic2). The second is the diffuse noise present in the transmission room, Bv1 and Bv2, captured by mic1 and mic2, respectively. The diffuse noises Bv1 and Bv2 are beyond the scope of this paper and are not taken into account. In the receiving room, as depicted in Fig. 1, the echo is due to the acoustic coupling between the loudspeakers and the microphones in this room. In the scheme of Fig. 1, the acoustic echo paths Ch1 and Ch2 in the local room are modeled by adaptive FIR filters hv1 and hv2, whose summed outputs produce an estimate ŷ of the true echo y. The physical impulse responses Ch1 and Ch2 are in fact of infinite length; nevertheless, the filters hv1 and hv2 are assumed to be “sufficiently long”, in the sense that the tails of Ch1 and Ch2 not modeled by hv1 and hv2 have low energy and can be neglected. In the sequel, “true” impulse responses refers only to the first parts of Ch1 and Ch2, which contain most of the energy and are assumed to have the same length L as the model filters hv1 and hv2.

In SAEC for teleconferencing, a fundamental problem is whether the true impulse responses of the acoustic echo paths can be identified at all. This problem arises from the correlation between the two signals picked up in the remote room. SAEC is fundamentally different from traditional mono echo cancellation. A straightforwardly implemented stereo canceller would have to track changing echo paths not only in the receiving room but also in the transmission room. For example, the canceller has to reconverge if one talker stops talking and another starts talking at a different location in the transmission room. No adaptive algorithm can track such a change sufficiently fast, so this scheme results in poor echo suppression. Thus, a direct generalization of mono AEC to the stereo case does not give satisfactory performance. The problems of SAEC were first described in an early paper [4], and later in [3]. The fundamental problem is that the two channels may carry linearly related signals, which in turn may make the normal equations to be solved by the adaptive algorithm singular. The normal equations then have no unique solution but infinitely many, and it can be shown that all of them except the physically true one depend on the transmission room. As a result, how to handle this properly has been studied intensively.
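The non-uniqueness described above can be checked numerically. The following minimal sketch assumes a single white-noise source and two short, randomly drawn transmission-room paths; the names `g1`, `g2`, `stacked_covariance` and the sizes `L` and `n` are hypothetical choices for illustration and are not taken from the paper. It verifies the linear relation between the two channels and shows that the covariance matrix of the stacked two-channel input is numerically singular when the adaptive filter is at least as long as the transmission-room paths.

```python
import numpy as np

rng = np.random.default_rng(0)

L = 8      # adaptive filter length per channel (illustrative value)
n = 4000   # number of samples

# One far-end source picked up by two microphones through two paths.
s = rng.standard_normal(n)
g1 = rng.standard_normal(L)   # hypothetical path source -> mic1
g2 = rng.standard_normal(L)   # hypothetical path source -> mic2
x1 = np.convolve(s, g1)[:n]
x2 = np.convolve(s, g2)[:n]

# Linear relation between the channels: x1 * g2 == x2 * g1 (convolution).
relation_error = np.max(np.abs(np.convolve(x1, g2)[:n] - np.convolve(x2, g1)[:n]))


def stacked_covariance(a, b, length):
    """Sample covariance of the stacked regressor [a(t..t-length+1), b(t..t-length+1)]."""
    rows = [
        np.concatenate((a[t - length + 1:t + 1][::-1], b[t - length + 1:t + 1][::-1]))
        for t in range(length - 1, len(a))
    ]
    X = np.array(rows)
    return X.T @ X / len(rows)


R = stacked_covariance(x1, x2, L)

# The vector [g2; -g1] lies in the null space of R, so the normal
# equations have infinitely many solutions.
v = np.concatenate((g2, -g1))
null_residual = np.linalg.norm(R @ v)

# Reference case: two independent white channels give a well-conditioned matrix.
R_ind = stacked_covariance(rng.standard_normal(n), rng.standard_normal(n), L)

cond_stereo = np.linalg.cond(R)
cond_independent = np.linalg.cond(R_ind)
print(f"relation error: {relation_error:.2e}, null residual: {null_residual:.2e}")
print(f"condition number, stereo: {cond_stereo:.2e}, independent: {cond_independent:.2e}")
```

Any multiple of `v` can be added to a solution of the normal equations without changing the residual, and `v` is built from the transmission-room paths, which is the dependence on the transmission room mentioned in the text.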

The solution of the normal equations was generalized in a more practical sense in Refs. [5] and [6]. These works explain that, in practice, the problem is not actually singular but extremely ill-conditioned, because the adaptive filter is shorter than the echo paths of the transmission room. Furthermore, the transmission room is in practice not completely stationary: smooth, continuous changes exist, which slightly improve the situation by making the problem somewhat less ill-conditioned [7], [8]. A complete theory of the non-uniqueness and a characterization of the SAEC solution were presented in Refs. [9] and [10]. It is shown there that the only solution to the non-uniqueness problem is to reduce the correlation between the stereo signals; an efficient low-complexity method for this purpose was also given in [8] and [9]. Ref. [11] presents a combination of mono and stereo echo cancellation that has lower complexity than a pure stereo solution. More recently, attention has focused on other methods that decrease the cross-correlation between the channels in order to obtain well-behaved estimates of the echo paths [12]. The main problem is how to reduce the correlation sufficiently without affecting stereo perception and sound quality. Early examples of SAEC implementations can be found in [13], [14], [15]. These solutions were presented before the theory and limitations of SAEC were fully understood, and were mainly based on a single adaptive filter for each return channel. The performance of SAEC depends on the choice of algorithm much more strongly than in the monophonic case. This is easily recognized, since the performance of most adaptive algorithms depends on the condition number of the input signal covariance matrix. Several other efficient techniques address these problems differently, among them the partial update of the filter coefficients explained in [16] and [17], and the use of two-error filtering to avoid the channel coherence problem, described in [18].
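As an illustration of reducing the correlation between the stereo signals, the sketch below adds a small half-wave-rectifier nonlinearity to each channel, a form commonly used in the SAEC literature. This is an assumption: the nonlinearity and parameter values actually used in the paper are not given in this excerpt, and `alpha`, the path lengths and the helper names are hypothetical. The condition number of the stacked input covariance matrix is used as a rough indicator of how ill-conditioned the identification problem is.

```python
import numpy as np


def stacked_covariance(a, b, length):
    """Sample covariance of the stacked regressor [a(t..t-length+1), b(t..t-length+1)]."""
    rows = [
        np.concatenate((a[t - length + 1:t + 1][::-1], b[t - length + 1:t + 1][::-1]))
        for t in range(length - 1, len(a))
    ]
    X = np.array(rows)
    return X.T @ X / len(rows)


def half_wave_nonlinearity(x, alpha, positive=True):
    """Return x plus a scaled half-wave rectified copy of x.

    positive=True adds alpha * (x + |x|) / 2, positive=False adds
    alpha * (x - |x|) / 2, so the two channels are distorted differently.
    """
    rectified = (x + np.abs(x)) / 2 if positive else (x - np.abs(x)) / 2
    return x + alpha * rectified


rng = np.random.default_rng(0)
L, n = 8, 4000                      # illustrative sizes, not the paper's
s = rng.standard_normal(n)          # single far-end source
g1, g2 = rng.standard_normal(L), rng.standard_normal(L)  # hypothetical paths
x1 = np.convolve(s, g1)[:n]
x2 = np.convolve(s, g2)[:n]

alpha = 0.5                         # hypothetical nonlinearity gain
x1_nl = half_wave_nonlinearity(x1, alpha, positive=True)
x2_nl = half_wave_nonlinearity(x2, alpha, positive=False)

cond_before = np.linalg.cond(stacked_covariance(x1, x2, L))
cond_after = np.linalg.cond(stacked_covariance(x1_nl, x2_nl, L))
print(f"condition number before: {cond_before:.2e}, after: {cond_after:.2e}")
```

The nonlinear components of the two channels are no longer linearly related, so the covariance matrix becomes invertible. How large `alpha` can be made without audible distortion is the perceptual trade-off discussed in the text, and this sketch does not address it.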

In the SAEC application, the condition number is very high, and algorithms such as the LMS or the NLMS, which do not take the coherence between the input signals into account, converge very slowly to the theoretical solution. It is therefore of great interest to study multi-channel adaptive filtering algorithms. A framework for multi-channel adaptive filtering can be found in Refs. [2], [3], [4], [5], [16] and [19].

In [1], we proposed a new version of the fast Newton transversal filter (FNTF) algorithm for SAEC applications. Here, we propose an efficient modification of this algorithm for the same applications. The proposed algorithm performs well in the SAEC case and takes into account the correlation effect of the impulse responses. We also propose a new numerical stabilization technique that preserves the good properties of the prediction part of the proposed algorithm even with a speech signal as input. We describe the basic FNTF algorithm and its modified version, and present simulation results that demonstrate the good performance of the proposed algorithm in SAEC applications in which the acoustic channels are highly correlated.

This paper is organized as follows. Section 2 explains the SAEC problem and describes the fundamental differences between mono and stereo acoustic echo cancellation. Section 3 presents two-channel adaptive filtering algorithms, with a detailed presentation of the two-channel FNTF (2CFNTF) algorithm. Section 4 first gives the existing decorrelating versions of the algorithms used, and then describes the proposed modified two-channel fast Newton transversal filter (M2CFNTF) algorithm, which takes into account the correlation effects of the channels. Section 5 presents a new numerical technique that stabilizes the proposed M2CFNTF algorithm. Section 6 compares the proposed algorithm with other algorithms in terms of complexity. Finally, simulation results are presented in Section 7. The notation used in this paper is fairly standard. Boldface symbols are used for vectors and matrices. We also use the following notation:

L: length of the adaptive filter;

N: length of the predictors;

t: discrete time index;

(.)T: transpose.

Section snippets

The stereophonic acoustic echo cancellation SAEC problem

In our study, we suppose that the distant room system is stationary, linear and time invariant, so that we have the following relation: $(\mathbf{X}_{v1})^T \mathbf{G}_{v2} = (\mathbf{X}_{v2})^T \mathbf{G}_{v1}$, where Gv1 and Gv2 stand for the impulse responses of the source-to-microphone acoustic paths in the remote room as indicated in Fig. 1, and Xv1(n) and Xv2(n) stand for vectors of signal samples of the microphone outputs in the same room. Now, we consider the following recursive least squares (RLS) cost function (see Fig. 1 for notations): $J_{L,t} = \sum_{p=1}^{t} w^{t-p}\, y_t - (h$ ...

The adaptive filtering algorithm

In SAEC applications, we use a two-channel adaptive filter. However, performance differs considerably depending on the chosen algorithm. In the following, we present known adaptive algorithms: the NLMS, the FTF (or fast recursive least squares, FRLS) and the 2CFNTF [1] algorithms. These algorithms are selected for comparison with the proposed stabilized M2CFNTF algorithm.
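For orientation, a plain two-channel NLMS update can be sketched as follows. This is a generic textbook form, not the extended E2CLMS or the FNTF variants compared in the paper; the function name, the step size `mu`, the regularization `delta` and the test signals are assumptions made for illustration. Independent white-noise inputs are used in the demonstration so that the two echo paths are identifiable.

```python
import numpy as np


def two_channel_nlms(x1, x2, y, L, mu=0.5, delta=1e-6):
    """Generic two-channel NLMS sketch.

    x1, x2 : reference (loudspeaker) signals
    y      : microphone signal containing the echo
    L      : length of each adaptive filter
    Returns the two estimated filters and the a priori error signal.
    """
    h1 = np.zeros(L)
    h2 = np.zeros(L)
    e = np.zeros(len(y))
    for t in range(L - 1, len(y)):
        # Most recent L samples of each channel, newest first.
        u1 = x1[t - L + 1:t + 1][::-1]
        u2 = x2[t - L + 1:t + 1][::-1]
        # A priori error: microphone signal minus the summed filter outputs.
        e[t] = y[t] - h1 @ u1 - h2 @ u2
        # Normalize by the total input energy of both channels.
        norm = u1 @ u1 + u2 @ u2 + delta
        h1 += mu * e[t] * u1 / norm
        h2 += mu * e[t] * u2 / norm
    return h1, h2, e


rng = np.random.default_rng(1)
L, n = 16, 5000
c1, c2 = rng.standard_normal(L), rng.standard_normal(L)  # hypothetical echo paths
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)  # independent inputs
y = np.convolve(x1, c1)[:n] + np.convolve(x2, c2)[:n]    # noise-free echo

h1, h2, e = two_channel_nlms(x1, x2, y, L)
misalignment = (np.sum((h1 - c1) ** 2) + np.sum((h2 - c2) ** 2)) / (
    np.sum(c1 ** 2) + np.sum(c2 ** 2)
)
print(f"normalized misalignment: {misalignment:.2e}")
```

Each iteration costs roughly 2L multiplications for filtering and 2L for the update. With strongly coherent inputs, as in SAEC, the same update converges much more slowly, which is the motivation for the faster algorithms considered here.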

The correlation effect on the algorithms

As we have explained before, the SAEC can be viewed as a straightforward generalization of the single-channel acoustic echo cancellation principle [2]. Fig. 1 shows this technique for one microphone in the receiving room (which is represented by the two echo paths, hv1 and hv2, between the two loudspeakers and the microphone). The two reference signals, Xv1 and Xv2, from the transmission room are obtained by two microphones in the case of teleconferencing. These signals are derived by filtering

New numerical stabilization technique of the new M2CFNTF algorithm: Second modification

In [1], we adapted and then generalized a new numerical stabilization method, recently proposed in [25], to the two-channel 2CFNTF algorithm. We recall that this technique is inspired by the work in [26], [27], [28]. In this paper, we propose a new version of this technique to be used with the proposed algorithm. The most important difference between these two stabilization techniques lies in the calculation of the a priori backward prediction error in the two prediction parts of the 2CFNTF

Computational complexity

In the computational complexity study of the proposed algorithms, we take into account only multiplication operations. The computational complexity of the fast version of the 2CFNTF algorithm, listed in Section 3.3 and proposed in [1], is 4L+24N multiplications (see Table 2): 4L multiplications for the filtering parts and 24N multiplications for the prediction parts. Here, the complexity is given for the two-channel case (for more details, see Fig. 1). The complexity of the 2CNLMS is 4L and

Description of the signals used in simulations

In this simulation, we have conducted two kinds of experiments according to the signals used. The first uses stationary USASI noise signals sampled at 16 kHz (a real signal with a speech-like spectrum, used in simulations to test the convergence speed of algorithms). The second uses non-stationary signals (real speech signals) sampled at 16 kHz. We have used two speech samples as signal sources, shown in Fig. 2 (male) and Fig. 3 (female). These signals will be

Conclusion

In this paper, we have derived a modified two-channel version of the fast Newton transversal filter (FNTF) algorithm. We have compared the performance of the M2CFNTF and 2CFNTF algorithms with that of two other two-channel adaptive filtering algorithms (the E2CLMS and the 2CFTF algorithms). Simulation results have shown that the 2CFNTF algorithm proposed in [1] performs similarly to the 2CFTF algorithm in SAEC applications in terms of convergence speed and tracking ability. We have also noted the

Acknowledgements

The author, Dr Mohamed DJENDI, would like to thank the anonymous reviewers for the useful comments that they provided and their overall objective recommendations, which have largely improved the paper.

References (28)

  • Shimauchi S, Makino S. Stereo projection echo canceller with true echo path estimation. Proceedings of IEEE ICASSP....
  • Makino S, Strauss K, Shimauchi S, Haneda Y, Nakagawa A. Subband stereo echo canceller using the projection algorithm...
  • Benesty J, et al. A better understanding and an improved solution to the problems of stereophonic acoustic echo cancellation. IEEE Trans Speech Audio Process (1998)
  • Benesty J, Morgan DR, Sondhi MM. A better understanding and an improved solution to the problems of stereophonic...

Mohamed Djendi received the DEUA, Eng. state, and M.Sc. degrees from Blida University of Science and Technology, Algeria, in 1994, 1997 and 2000, respectively, all in Electrical Engineering, communications and control. He received his first Ph.D. degree in Electronics-signal and communications from ENP School of Algiers, Algeria, in 2006. In 2010, he received a second Ph.D. degree in signal processing and telecommunications from the University of Science and Technology of Rennes, France. Currently, he is a full Professor at Blida University. In 2011-12, he holds a Postdoctoral position at University of Rennes (IRISA/ENSSAT). His fields of interest are speech and signal enhancement, adaptive filtering, SAEC, BSS and DSP for communications.

Reviews processed and approved for publication by Editor-in-Chief Dr. Manu Malek.
