Research on Segment Acoustic Model Based Mandarin LVCSR

Liu, Wenju; Tang, Yun; Peng, Shouye

doi:10.1007/978-3-642-01510-6_105


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5552)

Included in the following conference series:

International Symposium on Neural Networks


Abstract

Segment models (SMs) have outperformed hidden Markov models (HMMs) in connected word recognition systems; however, to our knowledge no published work has used an SM as the decoding acoustic model in large vocabulary continuous speech recognition (LVCSR), because of its computational complexity. We have built a preliminary SM-based Mandarin LVCSR system that uses CART and global tying to tie the parameters of the triphone models, and that uses the fast SM algorithm, the CF algorithm, and two-level pruning to speed up decoding. The system achieves 87.09% syllable accuracy on the Test-863 corpus within 4 times real time. We believe SMs offer an alternative for LVCSR systems, given further research on fast algorithms that make rational use of the SM's structural information.
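
The two figures quoted in the abstract, syllable accuracy and the real-time factor, can be illustrated with a small sketch. It assumes the common HTK-style definition of accuracy, (N - S - D - I) / N over N reference syllables, and a real-time factor defined as decoding time divided by audio duration; the abstract does not state which scoring tool or timing convention was used. The function names and the pinyin example are hypothetical.

```python
def count_errors(reference, hypothesis):
    """Return (substitutions, deletions, insertions) from a minimum edit-distance alignment."""
    # Each cell holds (total_errors, subs, dels, ins) for reference[:i] vs hypothesis[:j].
    prev = [(j, 0, 0, j) for j in range(len(hypothesis) + 1)]
    for i, ref_tok in enumerate(reference, start=1):
        cur = [(i, 0, i, 0)]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            c, s, d, n = prev[j - 1]
            # Diagonal move: a match costs nothing, a mismatch is a substitution.
            best = (c, s, d, n) if ref_tok == hyp_tok else (c + 1, s + 1, d, n)
            c, s, d, n = prev[j]
            best = min(best, (c + 1, s, d + 1, n))  # reference syllable deleted
            c, s, d, n = cur[j - 1]
            best = min(best, (c + 1, s, d, n + 1))  # extra syllable inserted
            cur.append(best)
        prev = cur
    _, subs, dels, ins = prev[-1]
    return subs, dels, ins


def syllable_accuracy(reference, hypothesis):
    """Accuracy in percent, assuming the (N - S - D - I) / N convention."""
    subs, dels, ins = count_errors(reference, hypothesis)
    n_ref = len(reference)
    return 100.0 * (n_ref - subs - dels - ins) / n_ref


def real_time_factor(decode_seconds, audio_seconds):
    """Decoding time divided by audio duration; 4.0 means 4 times real time."""
    return decode_seconds / audio_seconds


if __name__ == "__main__":
    ref = ["zhong", "guo", "ren", "min"]   # hypothetical reference syllables
    hyp = ["zhong", "guo", "min"]          # recognizer output with one syllable dropped
    print(count_errors(ref, hyp))
    print(syllable_accuracy(ref, hyp))
    print(real_time_factor(40.0, 10.0))
```

Under this convention insertions are penalized as well as substitutions and deletions, so accuracy can be lower than the percentage of correctly recognized syllables.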



Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, P.O.Box 2728, Beijing, 100190, China
Wenju Liu, Yun Tang & Shouye Peng


Editor information

Editors and Affiliations

Departamento de Control Automático, CINVESTAV-IPN, A.P. 14-740, Av. IPN 2508, D.F., 07360, México, México
Wen Yu
Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA
Haibo He
Dept. of Electrical and Computer Engineering, South Dakota School of Mines & Technology, 501 E. St. Joseph Street, Rapid City, SD 57701, USA
Nian Zhang


About this paper

Cite this paper

Liu, W., Tang, Y., Peng, S. (2009). Research on Segment Acoustic Model Based Mandarin LVCSR. In: Yu, W., He, H., Zhang, N. (eds) Advances in Neural Networks – ISNN 2009. ISNN 2009. Lecture Notes in Computer Science, vol 5552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01510-6_105


DOI: https://doi.org/10.1007/978-3-642-01510-6_105
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01509-0
Online ISBN: 978-3-642-01510-6
eBook Packages: Computer Science, Computer Science (R0)
