In-Car Speech Recognition Using Distributed Multiple Microphones

Li, Weifeng; Nishino, Takanori; Miyajima, Chiyomi; Itou, Katsunobu; Takeda, Kazuya; Itakura, Fumitada

doi:10.1007/978-3-540-30541-5_62

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3331)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

This paper describes a new multi-channel method of noisy speech recognition, which estimates the log spectrum of speech at a close-talking microphone based on the multiple regression of the log spectra (MRLS) of noisy signals captured by the distributed microphones. The advantages of the proposed method are as follows:

1. The method makes no assumptions about the positions of the speaker and noise sources relative to the microphones. The system can therefore be trained for various sitting positions of drivers.
2. The regression weights can be statistically optimized over speech segments of a certain length (e.g., sentences) under particular road conditions.

The performance of the proposed method is illustrated by speech recognition of real in-car dialogue data. Compared with the nearest distant microphone and a multi-microphone adaptive beamformer, the proposed approach obtains relative word error rate (WER) reductions of 9.8% and 3.6%, respectively.
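
The regression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature layout, the number of microphones, or the optimization criterion, so the code assumes one weight per microphone per frequency bin plus a bias term, fitted by ordinary least squares over the frames of a training segment. The function names `fit_mrls_weights` and `estimate_close_log_spectrum` and all array shapes are hypothetical, and the data is synthetic.

```python
import numpy as np


def fit_mrls_weights(distant_log_spectra, close_log_spectra):
    """Fit regression weights per frequency bin by ordinary least squares.

    distant_log_spectra: shape (n_frames, n_mics, n_bins), log spectra of
        the distributed microphones (assumed layout).
    close_log_spectra: shape (n_frames, n_bins), log spectrum at the
        close-talking microphone (the regression target).
    Returns an array of shape (n_bins, n_mics + 1); the last column is an
    assumed bias term.
    """
    n_frames, n_mics, n_bins = distant_log_spectra.shape
    weights = np.zeros((n_bins, n_mics + 1))
    for k in range(n_bins):
        # Design matrix for bin k: one column per microphone plus a bias column.
        design = np.hstack([distant_log_spectra[:, :, k], np.ones((n_frames, 1))])
        solution, *_ = np.linalg.lstsq(design, close_log_spectra[:, k], rcond=None)
        weights[k] = solution
    return weights


def estimate_close_log_spectrum(distant_log_spectra, weights):
    """Apply the weights: weighted sum over microphones plus bias, per bin."""
    weighted_sum = np.einsum("tmk,km->tk", distant_log_spectra, weights[:, :-1])
    return weighted_sum + weights[:, -1]


# Synthetic check: generate a noise-free target from known weights,
# then confirm that the fit recovers those weights.
rng = np.random.default_rng(0)
n_frames, n_mics, n_bins = 200, 5, 4
true_weights = rng.normal(size=(n_bins, n_mics + 1))
distant = rng.normal(size=(n_frames, n_mics, n_bins))
close = estimate_close_log_spectrum(distant, true_weights)
fitted = fit_mrls_weights(distant, close)
```

In this sketch the weights are fitted once per training segment, which mirrors the abstract's statement that they can be optimized over speech segments under particular road conditions. Nothing in the fit depends on where the microphones or sources are located.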

Author information

Authors and Affiliations

Nagoya University, Nagoya, 464-8603, Japan
Weifeng Li, Takanori Nishino, Chiyomi Miyajima, Katsunobu Itou & Kazuya Takeda
Meijo University, Nagoya, 468-8502, Japan
Fumitada Itakura

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-8656, Tokyo, Japan
Kiyoharu Aizawa
IBM Research, Tokyo Research Laboratory, 1623-14 Shimo-tsuruma, Yamato, 242-0001, Kanagawa, Japan
Yuichi Nakamura
National Institute of Informatics, Tokyo, Japan
Shin’ichi Satoh

About this paper

Cite this paper

Li, W., Nishino, T., Miyajima, C., Itou, K., Takeda, K., Itakura, F. (2004). In-Car Speech Recognition Using Distributed Multiple Microphones. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds) Advances in Multimedia Information Processing - PCM 2004. PCM 2004. Lecture Notes in Computer Science, vol 3331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30541-5_62

DOI: https://doi.org/10.1007/978-3-540-30541-5_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23974-1
Online ISBN: 978-3-540-30541-5
eBook Packages: Computer Science, Computer Science (R0)
