Abstract
The segment model (SM) has outperformed the hidden Markov model (HMM) in connected word recognition. However, we have found no reports of SM being used as the decoding acoustic model in large-vocabulary continuous speech recognition (LVCSR), because its computational complexity is restrictive. We have built a preliminary SM-based Mandarin LVCSR system. It uses CART and global tying to tie the parameters of the triphone models, and it uses the fast SM algorithm, the CF algorithm, and two-level pruning to speed up decoding. The system achieves 87.09% syllable accuracy on the Test-863 corpus while decoding in under 4 times real time. We believe SM offers an alternative for LVCSR systems, given further research into fast algorithms that make rational use of its structural information.
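The two figures quoted above can be made concrete with a small sketch. The abstract does not say how syllable accuracy was scored, so the code below assumes the common alignment-based definition, accuracy = (N - S - D - I) / N, and reads "within 4 real times" as a real-time factor (decoding time divided by audio duration) below 4. The function names and the sample syllables are hypothetical and are not taken from the paper.

```python
def edit_counts(ref, hyp):
    """Count (substitutions, deletions, insertions) on a minimum-cost
    Levenshtein alignment of a reference and a hypothesis sequence."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # dp[i][j] holds (cost, subs, dels, ins) for ref[:i] versus hyp[:j].
    dp = [[None] * cols for _ in range(rows)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        dp[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, cols):
        dp[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, n)              # match, no cost
            else:
                diag = (c + 1, s + 1, d, n)      # substitution
            c, s, d, n = dp[i - 1][j]
            dele = (c + 1, s, d + 1, n)          # deletion
            c, s, d, n = dp[i][j - 1]
            ins = (c + 1, s, d, n + 1)           # insertion
            dp[i][j] = min(diag, dele, ins, key=lambda t: t[0])
    _, subs, dels, inss = dp[-1][-1]
    return subs, dels, inss


def syllable_accuracy(ref, hyp):
    """Accuracy in percent under the assumed (N - S - D - I) / N definition."""
    subs, dels, inss = edit_counts(ref, hyp)
    return 100.0 * (len(ref) - subs - dels - inss) / len(ref)


def real_time_factor(decode_seconds, audio_seconds):
    """Decoding time divided by audio duration (assumed reading of 'real times')."""
    return decode_seconds / audio_seconds


# Hypothetical example: one reference syllable is missing from the hypothesis.
reference = ["zhong", "guo", "ren", "min"]
hypothesis = ["zhong", "guo", "min"]
print(syllable_accuracy(reference, hypothesis))
print(real_time_factor(36.0, 10.0))
```

Under this reading, a decoder that needs 36 seconds for 10 seconds of audio runs at 3.6 times real time, which would fall within the stated bound.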
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, W., Tang, Y., Peng, S. (2009). Research on Segment Acoustic Model Based Mandarin LVCSR. In: Yu, W., He, H., Zhang, N. (eds) Advances in Neural Networks – ISNN 2009. ISNN 2009. Lecture Notes in Computer Science, vol 5552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01510-6_105
DOI: https://doi.org/10.1007/978-3-642-01510-6_105
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01509-0
Online ISBN: 978-3-642-01510-6
eBook Packages: Computer Science, Computer Science (R0)