Abstract
This paper describes a new multi-channel method of noisy speech recognition, which estimates the log spectrum of speech at a close-talking microphone based on the multiple regression of the log spectra (MRLS) of noisy signals captured by the distributed microphones. The advantages of the proposed method are as follows:
1. The method does not make any assumptions about the positions of the speaker and noise sources with respect to the microphones. Therefore, the system can be trained for various sitting positions of drivers.
2. The regression weights can be statistically optimized over a certain length of speech segments (e.g., sentences of speech) under particular road conditions.

The performance of the proposed method is illustrated by speech recognition of real in-car dialogue data. In comparison to the nearest distant microphone and multi-microphone adaptive beamformer, the proposed approach obtains relative word error rate (WER) reductions of 9.8% and 3.6%, respectively.
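The regression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one independent linear regression per frequency bin with an intercept term, fitted by ordinary least squares, and the function names (`fit_regression_weights`, `estimate_close_log_spec`) and the synthetic data are hypothetical. The abstract does not state the exact form of the regression or how the weights are optimized.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_regression_weights(distant_log_spec, close_log_spec):
    """Fit [bias, w_1, ..., w_N] for one frequency bin (assumed model).

    distant_log_spec: list of frames, each holding the log-spectral value
        of that bin at each of the N distant microphones.
    close_log_spec: log-spectral value of the same bin at the
        close-talking microphone, one per frame.
    """
    rows = [[1.0] + list(frame) for frame in distant_log_spec]
    dim = len(rows[0])
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(dim)]
           for i in range(dim)]
    xty = [sum(r[i] * y for r, y in zip(rows, close_log_spec))
           for i in range(dim)]
    return solve_linear_system(xtx, xty)


def estimate_close_log_spec(weights, frame):
    """Estimate the close-talking log spectrum for one frame and bin."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], frame))


# Synthetic, noise-free data for three hypothetical distant microphones.
random.seed(0)
true_weights = [0.5, 0.2, 0.3, 0.4]  # bias followed by per-microphone weights
frames = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]
targets = [estimate_close_log_spec(true_weights, f) for f in frames]

weights = fit_regression_weights(frames, targets)
```

In this sketch the training frames play the role of the speech segment over which the weights are optimized. Nothing in the fit depends on microphone or speaker geometry, which is consistent with the first advantage listed above.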
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, W., Nishino, T., Miyajima, C., Itou, K., Takeda, K., Itakura, F. (2004). In-Car Speech Recognition Using Distributed Multiple Microphones. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds) Advances in Multimedia Information Processing - PCM 2004. PCM 2004. Lecture Notes in Computer Science, vol 3331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30541-5_62
DOI: https://doi.org/10.1007/978-3-540-30541-5_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23974-1
Online ISBN: 978-3-540-30541-5
eBook Packages: Computer Science, Computer Science (R0)