Confidence Measures in Automatic Speech Recognition Systems for Error Detection in Restricted Domains

Olcoz, Julia; Ortega, Alfonso; Miguel, Antonio; Lleida, Eduardo

doi:10.1007/978-3-319-13623-3_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8854)

Abstract

This paper presents the performance achieved using Confidence Measures (CM) in Automatic Speech Recognition (ASR) for the transcription of weather reports from the Spanish public broadcast channel (RTVE). The CM computation has three steps: first, Acoustic-Phonetic Decoding (APD) is carried out; then the reference and hypothesis word sequences are aligned through a phone graph; finally, within this decoding mesh and for a given time interval, the maximum posterior probability of the hypothesized word is selected as the CM value. The final goal is to use the CM module as an extension of the ASR system that automatically evaluates the reliability of recognition results and discards low-confidence words at the output. These CMs can serve as a tool for unsupervised learning techniques and as an aid to human supervision of recognition results. If accurate enough, they would increase both the usability and the robustness of speech applications.
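
The last step of the CM computation and the subsequent filtering can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes word-level arcs with already normalized posteriors rather than the phone graph described above, and the names `Arc`, `confidence`, `filter_hypothesis`, the example words, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One word hypothesis in a decoding graph (hypothetical structure)."""
    word: str
    start: float      # start time in seconds
    end: float        # end time in seconds
    posterior: float  # assumed already normalized to [0, 1]


def confidence(word, start, end, arcs):
    """Return the maximum posterior among arcs that carry `word`
    and overlap the time interval [start, end]; 0.0 if none do."""
    scores = [a.posterior for a in arcs
              if a.word == word and a.start < end and a.end > start]
    return max(scores, default=0.0)


def filter_hypothesis(hypothesis, arcs, threshold=0.5):
    """Keep only hypothesized words whose confidence reaches the threshold."""
    kept = []
    for word, start, end in hypothesis:
        cm = confidence(word, start, end, arcs)
        if cm >= threshold:
            kept.append((word, cm))
    return kept


# Toy graph: competing word arcs over the same time intervals (made-up values).
arcs = [
    Arc("lluvia", 0.0, 0.5, 0.9),
    Arc("lluvias", 0.0, 0.5, 0.1),
    Arc("en", 0.5, 0.7, 0.4),
    Arc("el", 0.5, 0.7, 0.6),
    Arc("norte", 0.7, 1.2, 0.8),
]
hypothesis = [("lluvia", 0.0, 0.5), ("en", 0.5, 0.7), ("norte", 0.7, 1.2)]
result = filter_hypothesis(hypothesis, arcs)
print(result)
```

In this toy case the word "en" is discarded because its best posterior in the interval falls below the assumed threshold. In practice the threshold would be tuned on held-out data to trade off false acceptances against false rejections.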

Author information

Authors and Affiliations

ViVoLab, Aragon Institute for Engineering Research (I3A), University of Zaragoza, Spain
Julia Olcoz, Alfonso Ortega, Antonio Miguel & Eduardo Lleida

Editor information

Editors and Affiliations

ETSIT, Las Palmas de Gran Canaria, Spain
Juan Luis Navarro Mesa, Eduardo Hernández Pérez, Pedro Quintana Morales, Antonio Ravelo García & Iván Guerra Moreno
University of Zaragoza, Spain
Alfonso Ortega
Dep. of Electronics, Telecommunications and Informatics Engineering, University of Aveiro, Portugal
António Teixeira
ATVS Biometric Recognition Group, Universidad Autónoma de Madrid, Spain
Doroteo T. Toledano

About this paper

Cite this paper

Olcoz, J., Ortega, A., Miguel, A., Lleida, E. (2014). Confidence Measures in Automatic Speech Recognition Systems for Error Detection in Restricted Domains. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science, vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_18

DOI: https://doi.org/10.1007/978-3-319-13623-3_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13622-6
Online ISBN: 978-3-319-13623-3
eBook Packages: Computer Science, Computer Science (R0)
