Design of Tandem Architecture Using Segmental Trend Features

Yun, Young-Sun; Lee, Yunkeun

doi:10.1007/978-3-540-74628-7_49

Young-Sun Yun¹ &
Yunkeun Lee²

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4629)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper investigates a tandem architecture (TA) based on segmental features. Previous studies have reported that recognition systems based on segmental features give better results than systems based on conventional features. In this paper we combine the segmental feature with the tandem architecture, which uses both hidden Markov models and neural networks. In general, a segmental feature can be separated into a trend and a location. Because the trend represents the variation of the segmental feature and accounts for a large portion of it, the trend information was used as an independent or additional feature for the speech recognition system. We applied the trend information of segmental features to the TA and used the posterior probabilities output by the neural network as inputs to the recognition system. Experiments were performed on the Aurora2 database to examine the potential of the trend-feature-based TA. The results showed that the proposed system outperforms the conventional system in very low SNR environments. These findings led us to conclude that the trend information in the TA can be used in addition to the traditional MFCC features.
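
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the location is taken to be the per-dimension mean of a segment, the trend is the frame-wise deviation from that mean, a single softmax layer with hand-picked weights stands in for the neural network, and log posteriors are appended to an MFCC frame to form the input to the HMM recognizer. The function names (`split_trend_location`, `posteriors`, `tandem_features`) are hypothetical.

```python
import math


def split_trend_location(segment):
    """Split a segment (list of frames, each a list of floats) into a trend
    and a location. Assumption: location = per-dimension mean of the segment,
    trend = each frame minus that mean."""
    n_frames = len(segment)
    n_dims = len(segment[0])
    location = [sum(frame[d] for frame in segment) / n_frames
                for d in range(n_dims)]
    trend = [[frame[d] - location[d] for d in range(n_dims)]
             for frame in segment]
    return trend, location


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def posteriors(trend, weights, biases):
    """Toy one-layer classifier standing in for the neural network: it
    flattens the trend and returns one posterior probability per class."""
    flat = [value for frame in trend for value in frame]
    scores = [sum(w * x for w, x in zip(row, flat)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


def tandem_features(mfcc_frame, post):
    """Append log posteriors to an MFCC frame (assumed combination scheme)."""
    return list(mfcc_frame) + [math.log(max(p, 1e-10)) for p in post]


# Three frames of a two-dimensional feature; values are made up.
segment = [[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]]
trend, location = split_trend_location(segment)

# Hypothetical weights for two classes: class 1 responds to a rising first
# dimension, class 0 to a falling one.
weights = [[0.5, 0.0, 0.0, 0.0, -0.5, 0.0],
           [-0.5, 0.0, 0.0, 0.0, 0.5, 0.0]]
biases = [0.0, 0.0]

post = posteriors(trend, weights, biases)
features = tandem_features(segment[1], post)
```

In this toy segment the first dimension rises over time, so the trend carries that rise while the location holds the constant offset. A trained multi-layer network and real MFCC frames would take the place of the hand-picked weights and values used here.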

He was a visiting researcher at ETRI from May 2006 to Feb. 2007.

Author information

Authors and Affiliations

Dept. of Information and Communication Engineering, Hannam University, Daejeon, Republic of Korea
Young-Sun Yun
Spoken Language Processing Team, ETRI, Daejeon, Republic of Korea
Yunkeun Lee

Editor information

Václav Matoušek, Pavel Mautner

About this paper

Cite this paper

Yun, YS., Lee, Y. (2007). Design of Tandem Architecture Using Segmental Trend Features. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_49

DOI: https://doi.org/10.1007/978-3-540-74628-7_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74627-0
Online ISBN: 978-3-540-74628-7
eBook Packages: Computer Science, Computer Science (R0)
