Abstract
The purpose of our study is to develop a spoken dialogue system for in-vehicle appliances. Such a multi-domain dialogue system should be able to react to changes of topic and to recognize both separate words and whole sentences quickly and accurately. We propose a novel recognition method that integrates recognition at three levels: sentences, partial words, and phonemes. The degree of confidence is determined by the degree to which the recognition results match across these three levels. We conducted speech recognition experiments for in-vehicle appliances. For sentence units, recognition accuracy was 96.2% with the proposed method and 92.9% with a conventional word bigram. For word units, recognition accuracy was 86.2% with the proposed method and 75.1% with whole-word recognition. We therefore conclude that our method can be applied effectively in spoken dialogue systems for in-vehicle appliances.
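The abstract does not say how the match across the three levels is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each recognizer's output has already been converted to a phoneme sequence, and it scores confidence as the mean pairwise similarity of those sequences. The function names `agreement` and `confidence`, the use of `difflib.SequenceMatcher`, and the example phoneme strings are all illustrative assumptions.

```python
from difflib import SequenceMatcher


def agreement(a, b):
    """Similarity in [0, 1] between two phoneme sequences.

    Assumption: similarity is difflib's ratio, 2 * matches / total length.
    The paper may use a different matching measure.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def confidence(sentence_level, partial_word_level, phoneme_level):
    """Mean pairwise agreement of the three recognition results.

    Each argument is a list of phoneme symbols produced by one
    (hypothetical) recognizer: sentence, partial-word, or phoneme level.
    """
    pairs = [
        (sentence_level, partial_word_level),
        (sentence_level, phoneme_level),
        (partial_word_level, phoneme_level),
    ]
    return sum(agreement(x, y) for x, y in pairs) / len(pairs)


# Hypothetical outputs for one utterance; the symbols are made up.
sentence_hyp = list("eakoN")
partial_hyp = list("eakoN")
phoneme_hyp = list("eakoM")  # phoneme recognizer disagrees on the last symbol

score = confidence(sentence_hyp, partial_hyp, phoneme_hyp)
```

Under this sketch, a result on which all three levels agree scores 1.0 and a result with no overlap scores 0.0; a dialogue system could compare the score against a threshold to decide whether to accept the result or ask the user to confirm.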
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nishida, M., Horiuchi, Y., Kuroiwa, S., Ichikawa, A. (2010). Automatic Speech Recognition Based on Multiple Level Units in Spoken Dialogue System for In-Vehicle Appliances. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science, vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_68
DOI: https://doi.org/10.1007/978-3-642-15760-8_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science, Computer Science (R0)