Abstract
This paper presents the performance achieved using Confidence Measures (CMs) in Automatic Speech Recognition (ASR) for the transcription of weather reports from the Spanish public broadcast channel (RTVE). The CM computation has three steps: first, Acoustic-Phonetic Decoding (APD) is carried out; then, the reference and hypothesis word sequences are aligned through a phone graph; finally, for a given time interval in this decoding mesh, the maximum posterior probability of the hypothesized word is selected as the CM value. The final goal is to use the CM module as an extension of the ASR system that automatically evaluates the reliability of recognition results and discards low-confidence words at the output. These CMs can serve as a tool for unsupervised learning techniques and can also assist human supervision of recognition results. If accurate enough, they would increase both the usability and the robustness of speech applications.
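The last step of the CM computation, and the filtering of low-confidence words, can be illustrated with a minimal sketch. It is not the authors' implementation: the per-frame posterior dictionaries, the `HypWord` structure, the function names, the toy values and the 0.5 threshold are all assumptions made here for illustration, and the decoding mesh is reduced to a list of per-frame word posteriors.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HypWord:
    """A hypothesized word and its time interval (hypothetical structure)."""
    word: str
    start: int  # first frame of the interval, inclusive
    end: int    # last frame of the interval, exclusive


def confidence(hyp: HypWord, frame_posteriors: List[Dict[str, float]]) -> float:
    """Return the maximum posterior of the word over its time interval.

    frame_posteriors[t] maps each word active at frame t to its posterior.
    A word absent from a frame contributes 0.0 for that frame.
    """
    span = frame_posteriors[hyp.start:hyp.end]
    return max((frame.get(hyp.word, 0.0) for frame in span), default=0.0)


def score_hypothesis(
    hyps: List[HypWord],
    frame_posteriors: List[Dict[str, float]],
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[Tuple[str, float, bool]]:
    """Attach a CM to every word and flag whether it passes the threshold."""
    scored = []
    for hyp in hyps:
        cm = confidence(hyp, frame_posteriors)
        scored.append((hyp.word, cm, cm >= threshold))
    return scored


# Toy data: four frames with made-up posteriors.
toy_posteriors = [
    {"lluvia": 0.6, "lluvias": 0.4},
    {"lluvia": 0.9, "lluvias": 0.1},
    {"en": 0.3, "el": 0.7},
    {"en": 0.45, "el": 0.55},
]
toy_hyps = [HypWord("lluvia", 0, 2), HypWord("en", 2, 4)]

result = score_hypothesis(toy_hyps, toy_posteriors)
for word, cm, accepted in result:
    print(f"{word}: CM={cm:.2f} {'kept' if accepted else 'discarded'}")
```

In this toy case the first word is kept and the second is discarded, since its best posterior within the interval stays below the assumed threshold. In practice the threshold would be tuned on held-out data.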
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Olcoz, J., Ortega, A., Miguel, A., Lleida, E. (2014). Confidence Measures in Automatic Speech Recognition Systems for Error Detection in Restricted Domains. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science, vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_18
DOI: https://doi.org/10.1007/978-3-319-13623-3_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13622-6
Online ISBN: 978-3-319-13623-3
eBook Packages: Computer Science, Computer Science (R0)