A Comparison of Acoustic Models Based on Neural Networks and Gaussian Mixtures

Pavelka, Tomáš; Ekštein, Kamil

doi:10.1007/978-3-642-04208-9_41

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5729)

Abstract

This article compares the performance of neural network acoustic models and Gaussian mixture models (GMMs). We carried out tests that compare various models in terms of speed and recognition accuracy. Because the speed-accuracy trade-off depends not only on the acoustic model itself but also on the decoder parameter settings, we propose a comparison based on an equal number of active states during the decoding search. Statistical significance measures are also discussed, and a new method for confidence interval computation is introduced.
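
The abstract does not describe the new confidence interval method, so it is not reproduced here. As a rough illustration of what a confidence interval on recognition accuracy looks like, the sketch below uses the standard Wilson score interval instead. It assumes each test item is an independent correct/incorrect trial, which is a simplification for speech recognition output. The function name and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    This is a textbook method, NOT the method introduced in the paper.
    z = 1.96 corresponds to an approximately 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")

    p = correct / total                      # observed accuracy
    z2 = z * z
    denom = 1.0 + z2 / total
    centre = (p + z2 / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z2 / (4.0 * total * total)) / denom

    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts, chosen only to show the call; not results from the paper.
low, high = wilson_interval(correct=850, total=1000)
print(f"accuracy 0.850, approx. 95% interval: [{low:.4f}, {high:.4f}]")
```

When two models are scored on the same test set, comparing such intervals gives only a coarse picture; a paired test on per-utterance results is generally more sensitive.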

Author information

Authors and Affiliations

Faculty of Applied Sciences, Department of Computer Science and Engineering, University of West Bohemia, Czech Republic
Tomáš Pavelka & Kamil Ekštein

Editor information

Editors and Affiliations

University of West Bohemia at Pilsen, Czech Republic
Václav Matoušek
Department of Computer Science, University of West Bohemia in Pilsen, Univerzitni 8, 30614, Plzen, Czech Republic
Pavel Mautner

About this paper

Cite this paper

Pavelka, T., Ekštein, K. (2009). A Comparison of Acoustic Models Based on Neural Networks and Gaussian Mixtures. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science, vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_41

DOI: https://doi.org/10.1007/978-3-642-04208-9_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer Science, Computer Science (R0)
