Abstract
We propose high-order hidden Markov models (HO-HMMs) to capture the duration and dynamics of speech signals. In the proposed model, both the state transition probability and the output observation probability depend not only on the current state but also on several previous states. An extended Viterbi algorithm was developed to train the model and to recognize speech. The performance of the HO-HMM was investigated in experiments on speaker-independent Mandarin digit recognition. The experimental results show that the number of errors decreases as the order of the HO-HMM increases. They also show that systems with high-order distributions for both state transition and output observation outperform systems in which only the state transition distribution is high-order.
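To make the idea of a state-history-dependent model concrete, the following is a minimal sketch of Viterbi decoding for a second-order HMM with discrete observations. It is not the authors' extended Viterbi algorithm, which the abstract does not spell out. The function name, the parameter layout (pi, b0, a1, a2, b2), the restriction to order two, and the choice to condition the output probability on the previous and current state are all assumptions made for illustration.

```python
import math


def viterbi_second_order(obs, pi, b0, a1, a2, b2):
    """Most likely state path under an assumed second-order HMM (log domain).

    pi[i]       : P(s_0 = i)
    b0[i][o]    : P(o_0 = o | s_0 = i)
    a1[i][j]    : P(s_1 = j | s_0 = i)
    a2[i][j][k] : P(s_t = k | s_{t-2} = i, s_{t-1} = j)
    b2[j][k][o] : P(o_t = o | s_{t-1} = j, s_t = k)
    """
    n = len(pi)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    if len(obs) == 1:
        return [max(range(n), key=lambda i: log(pi[i]) + log(b0[i][obs[0]]))]

    # delta[j][k]: best log score of any path ending in (s_{t-1}=j, s_t=k).
    delta = [[log(pi[j]) + log(b0[j][obs[0]])
              + log(a1[j][k]) + log(b2[j][k][obs[1]])
              for k in range(n)] for j in range(n)]
    back = []  # back[t-2][j][k]: best s_{t-2} given (s_{t-1}=j, s_t=k)

    for t in range(2, len(obs)):
        new = [[float("-inf")] * n for _ in range(n)]
        ptr = [[0] * n for _ in range(n)]
        for j in range(n):
            for k in range(n):
                emit = log(b2[j][k][obs[t]])
                for i in range(n):
                    score = delta[i][j] + log(a2[i][j][k]) + emit
                    if score > new[j][k]:
                        new[j][k] = score
                        ptr[j][k] = i
        delta = new
        back.append(ptr)

    # Pick the best final state pair, then follow the back-pointers.
    j, k = max(((j, k) for j in range(n) for k in range(n)),
               key=lambda jk: delta[jk[0]][jk[1]])
    path = [j, k]
    for ptr in reversed(back):
        path.insert(0, ptr[path[0]][path[1]])
    return path


# Hypothetical two-state, two-symbol model; the numbers are illustrative only.
pi = [0.6, 0.4]
b0 = [[0.7, 0.3], [0.2, 0.8]]
a1 = [[0.8, 0.2], [0.3, 0.7]]
a2 = [[[0.9, 0.1], [0.4, 0.6]], [[0.5, 0.5], [0.1, 0.9]]]
b2 = [[[0.7, 0.3], [0.4, 0.6]], [[0.6, 0.4], [0.1, 0.9]]]
obs = [0, 0, 1, 1, 1]

path = viterbi_second_order(obs, pi, b0, a1, a2, b2)
print(path)
```

The sketch tracks the best score for each pair of consecutive states, so the cost per frame grows from N^2 for a first-order model to N^3 here. This is one way to see why higher orders need more computation and more parameters to estimate.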
This research was supported by the National Science Council, Republic of China, under Grant NSC93-2213-E-212-012. The authors would like to thank the National Center for High-performance Computing for providing computation power for this work. Thanks are also given to Chunghwa Telecom Laboratories, Chunghwa Telecom Co., Ltd., for providing the experimental database.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, LM., Lee, JC. (2006). A Study on High-Order Hidden Markov Models and Applications to Speech Recognition. In: Ali, M., Dapoigny, R. (eds) Advances in Applied Artificial Intelligence. IEA/AIE 2006. Lecture Notes in Computer Science, vol 4031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11779568_74
DOI: https://doi.org/10.1007/11779568_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35453-6
Online ISBN: 978-3-540-35454-3
eBook Packages: Computer Science, Computer Science (R0)