Abstract
This chapter discusses multi-microphone inverse filtering that does not rely on a priori information about the room acoustics, such as the room impulse responses between the target speaker and the microphones. A major problem with this type of processing is that the recovered speech is degraded by excessive equalization of the speech characteristics. To overcome this problem, several approaches have been studied within a multichannel linear prediction framework, since the framework may be able to perform speech dereverberation as well as noise attenuation. Here, we first discuss the relationship between optimal filtering and linear prediction. We then review our four approaches, which differ in how they treat the statistical properties of a speech signal.
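The excessive equalization mentioned above can be illustrated with a small numerical sketch. It is not the chapter's method, only a toy two-microphone example under stated assumptions: the source is a synthetic AR(2) process standing in for speech, the two room responses are short random filters (8 taps, hypothetical values), the data are noise-free, and the predictor order is chosen as the filter length minus one. Under these assumptions, plain multichannel linear prediction removes the room response but also removes the source's own correlation, so the output resembles the white excitation rather than the source itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Assumption: an AR(2) process driven by white noise e stands in for speech.
e = rng.standard_normal(n)
s = np.zeros(n)
for t in range(n):
    s[t] = e[t]
    if t >= 1:
        s[t] += 1.5 * s[t - 1]
    if t >= 2:
        s[t] -= 0.7 * s[t - 2]

# Assumption: two short, decaying random filters play the role of room responses.
K = 8
decay = np.exp(-0.4 * np.arange(K))
h1 = rng.standard_normal(K) * decay
h2 = rng.standard_normal(K) * decay
h1[0] = 1.0  # unit direct path on microphone 1, so the residual scale matches e

x1 = np.convolve(s, h1)[:n]
x2 = np.convolve(s, h2)[:n]


def past_matrix(x, order):
    """Columns hold x delayed by 1..order samples, zero-padded at the start."""
    cols = [np.concatenate([np.zeros(k), x[:len(x) - k]])
            for k in range(1, order + 1)]
    return np.column_stack(cols)


# Predict the current sample of microphone 1 from past samples of both microphones.
L = K - 1
X = np.hstack([past_matrix(x1, L), past_matrix(x2, L)])
w, *_ = np.linalg.lstsq(X, x1, rcond=None)
residual = x1 - X @ w

corr_with_innovation = np.corrcoef(residual, e)[0, 1]
corr_with_source = np.corrcoef(residual, s)[0, 1]
print(f"correlation with white excitation e: {corr_with_innovation:.3f}")
print(f"correlation with source s:           {corr_with_source:.3f}")
```

In this toy setting the prediction residual correlates strongly with the white excitation and only weakly with the source, which is the over-whitening effect that approaches accounting for the statistical properties of speech aim to avoid.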
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Miyoshi, M., Delcroix, M., Kinoshita, K., Yoshioka, T., Nakatani, T., Hikichi, T. (2010). Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information. In: Naylor, P., Gaubitch, N. (eds) Speech Dereverberation. Signals and Communication Technology. Springer, London. https://doi.org/10.1007/978-1-84996-056-4_9
DOI: https://doi.org/10.1007/978-1-84996-056-4_9
Publisher Name: Springer, London
Print ISBN: 978-1-84996-055-7
Online ISBN: 978-1-84996-056-4
eBook Packages: Engineering, Engineering (R0)