Did You Say What I Think You Said?

Ludwig, Bernd; Hitzenberger, Ludwig

doi:10.1007/978-3-642-32790-2_52

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7499)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

In this paper we discuss the problem that, in a dialogue system, a speech recognizer should be able to estimate whether recognition failed even when no correct transcription of the actual user utterance is available. Only with such a diagnosis can the dialogue system choose an adequate repair strategy, recover from the interaction problem with the user, and avoid negative consequences for the successful completion of the dialogue. We present a data collection for a controlled out-of-vocabulary scenario and discuss an approach that estimates the success of a speech recognizer’s results by exploring differences between the N-gram distribution in the best word chain and in the language model. Our experiments indicate that these differences are significant when speech recognition failed severely. From these results, we derive a quick test for failed recognition that is based on a negative language model.
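
The abstract gives no formulas, so the following Python sketch only illustrates the general idea of comparing the N-grams in a recognizer's best word chain against those known to a language model. It is not the authors' method. Every name in it (`bigrams`, `train_bigram_counts`, `unseen_bigram_ratio`, `looks_like_failed_recognition`), the choice of bigrams, the toy corpus, and the threshold of 0.5 are assumptions made for illustration. The paper's negative language model is not reconstructed here.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Bigram = Tuple[str, str]


def bigrams(words: List[str]) -> List[Bigram]:
    """Return the bigrams of a word sequence, padded with sentence markers."""
    padded = ["<s>"] + words + ["</s>"]
    return list(zip(padded, padded[1:]))


def train_bigram_counts(corpus: Iterable[List[str]]) -> Counter:
    """Count bigrams in a tokenized training corpus (stand-in for a language model)."""
    counts: Counter = Counter()
    for sentence in corpus:
        counts.update(bigrams(sentence))
    return counts


def unseen_bigram_ratio(hypothesis: List[str], lm_counts: Counter) -> float:
    """Fraction of bigrams in the hypothesis that never occur in the training counts."""
    grams = bigrams(hypothesis)  # never empty because of the padding
    unseen = sum(1 for g in grams if lm_counts[g] == 0)
    return unseen / len(grams)


def looks_like_failed_recognition(
    hypothesis: List[str], lm_counts: Counter, threshold: float = 0.5
) -> bool:
    """Flag a hypothesis whose bigrams are mostly unknown to the model.

    The threshold is an arbitrary illustrative value, not a result from the paper.
    """
    return unseen_bigram_ratio(hypothesis, lm_counts) > threshold


# Toy in-domain corpus (hypothetical data).
corpus = [["turn", "left"], ["turn", "right"], ["go", "straight"]]
lm_counts = train_bigram_counts(corpus)

plausible = ["turn", "left"]       # all bigrams were seen in training
implausible = ["left", "go", "turn"]  # no bigram was seen in training
```

In this toy setting, a word chain built from familiar words in an unfamiliar order is flagged, while an in-domain chain is not. A real system would work with smoothed N-gram probabilities and a threshold calibrated on held-out data rather than raw counts.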

Author information

Authors and Affiliations

Chair for Information Science, University Regensburg, Universitätstraße 31, D-93047, Regensburg, Germany
Bernd Ludwig & Ludwig Hitzenberger

Editor information

Editors and Affiliations

Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Department of Information Technologies, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Aleš Horák, Ivan Kopeček & Karel Pala

About this paper

Cite this paper

Ludwig, B., Hitzenberger, L. (2012). Did You Say What I Think You Said?. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_52

DOI: https://doi.org/10.1007/978-3-642-32790-2_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32789-6
Online ISBN: 978-3-642-32790-2
eBook Packages: Computer Science, Computer Science (R0)
