Automatic Speech Recognition Based on Multiple Level Units in Spoken Dialogue System for In-Vehicle Appliances

Nishida, Masafumi; Horiuchi, Yasuo; Kuroiwa, Shingo; Ichikawa, Akira

doi:10.1007/978-3-642-15760-8_68


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6231)

Included in the following conference series:

International Conference on Text, Speech and Dialogue


Abstract

The purpose of our study is to develop a spoken dialogue system for in-vehicle appliances. Such a multi-domain dialogue system should be able to react to a change of topic and to recognize both separate words and whole sentences quickly and accurately. We propose a novel recognition method that integrates recognition at three levels: sentences, partial words, and phonemes. The degree of confidence is determined by the degree to which the recognition results at these three levels match. We conducted speech recognition experiments for in-vehicle appliances. For sentence units, recognition accuracy was 96.2% with the proposed method and 92.9% with a conventional word bigram. For word units, recognition accuracy was 86.2% with the proposed method and 75.1% with whole-word recognition. We therefore conclude that our method can be applied effectively in spoken dialogue systems for in-vehicle appliances.
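The abstract does not say how the match between the three levels is measured. As an illustration only, the sketch below assumes that each recognizer's output has already been converted to a phoneme sequence and scores confidence as the mean pairwise similarity of those sequences. The function names, the use of Python's `difflib.SequenceMatcher`, and the example phoneme strings are all assumptions made for this sketch, not the authors' implementation.

```python
from difflib import SequenceMatcher
from itertools import combinations


def agreement(a, b):
    """Similarity in [0, 1] between two phoneme sequences (hypothetical measure)."""
    return SequenceMatcher(None, a, b).ratio()


def confidence(sentence_level, partial_word_level, phoneme_level):
    """Mean pairwise agreement of the three recognition results.

    Each argument is a list of phoneme symbols produced by one recognizer.
    Averaging the pairwise scores is an assumption of this sketch; the
    paper's actual scoring rule is not given in the abstract.
    """
    results = [sentence_level, partial_word_level, phoneme_level]
    pairs = list(combinations(results, 2))
    return sum(agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical outputs for one utterance: two levels agree, one differs
# in a single phoneme.
sentence_out = ["e", "a", "k", "o", "N"]
partial_out = ["e", "a", "k", "o", "N"]
phoneme_out = ["e", "a", "t", "o", "N"]

score = confidence(sentence_out, partial_out, phoneme_out)
print(f"confidence = {score:.3f}")
```

In a dialogue system, a score of this kind could be compared against a threshold to decide whether to accept a result or ask the user to confirm it; the threshold would have to be tuned on held-out data.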



Author information

Authors and Affiliations

Faculty of Science and Engineering, Doshisha University, Kyoto, Japan
Masafumi Nishida
Graduate School of Advanced Integration Science, Chiba University, Chiba, Japan
Yasuo Horiuchi & Shingo Kuroiwa
Faculty of Human Sciences, Waseda University, Saitama, Japan
Akira Ichikawa


Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Aleš Horák
Faculty of Informatics, Masaryk University, Botanická 68a, CZ-602 00, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Karel Pala


About this paper

Cite this paper

Nishida, M., Horiuchi, Y., Kuroiwa, S., Ichikawa, A. (2010). Automatic Speech Recognition Based on Multiple Level Units in Spoken Dialogue System for In-Vehicle Appliances. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science, vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_68


DOI: https://doi.org/10.1007/978-3-642-15760-8_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science, Computer Science (R0)
