Abstract
This paper addresses the problem of enhancing the quality of speech signals, which has received growing attention over the last few decades. Many approaches have been proposed in the literature under various configurations and operating hypotheses. The aim of this paper is to give an overview of the main classes of noise reduction algorithms proposed to date, focusing on the case of additive independent noise. In this context, we first distinguish between single-channel and multi-channel solutions: the former are generally based on statistical estimation of the signals involved, whereas the latter usually employ adaptive procedures (as in the classical adaptive noise cancellation scheme). Within these two general classes, we further distinguish between certain sub-families of algorithms. Subsequently, the impact of nonlinearity on the speech enhancement problem is highlighted: the lack of perfect linearity in the related processes and the non-Gaussian nature of the signals involved have motivated several researchers to propose a range of efficient nonlinear techniques for speech enhancement. Finally, for comparative purposes, the paper summarizes in tabular form the general features, the operating assumptions, the relative advantages and drawbacks, and the various types of nonlinear techniques for each class of speech enhancement strategy.
Copyright information
© 2007 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Hussain, A., Chetouani, M., Squartini, S., Bastari, A., Piazza, F. (2007). Nonlinear Speech Enhancement: An Overview. In: Stylianou, Y., Faundez-Zanuy, M., Esposito, A. (eds) Progress in Nonlinear Speech Processing. Lecture Notes in Computer Science, vol 4391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71505-4_12
DOI: https://doi.org/10.1007/978-3-540-71505-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71503-0
Online ISBN: 978-3-540-71505-4
eBook Packages: Computer Science (R0)