Design of Tandem Architecture Using Segmental Trend Features

  • Conference paper
Text, Speech and Dialogue (TSD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4629)

Abstract

This paper investigates a tandem architecture (TA) based on segmental features. Previous studies have reported that recognition systems based on segmental features outperform systems based on conventional features. Here we combine the segmental feature with the tandem architecture, which uses both hidden Markov models and neural networks. In general, a segmental feature can be separated into a trend and a location. Because the trend represents the variation of the segmental feature and accounts for a large portion of it, we used the trend information as an independent or additional feature for the speech recognition system. We applied the trend information of segmental features to the TA and used the posterior probabilities output by the neural network as inputs to the recognition system. Experiments were performed on the Aurora2 database to examine the potential of the trend-feature-based TA. The results show that the proposed system outperforms the conventional system in very low SNR environments. These findings led us to conclude that the trend information in the TA can be used in addition to the traditional MFCC features.
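
As a rough illustration of the two ideas in the abstract, the sketch below splits a segment of feature frames into a location and a trend, and turns network scores into posterior-based features. It is only a sketch under stated assumptions, not the authors' method: the location is assumed to be the per-dimension mean of the segment, the trend is assumed to be the mean-removed frames, and taking the logarithm of the posteriors is an assumed post-processing step that the abstract does not specify. The function names `split_segment`, `softmax`, and `tandem_features` are hypothetical.

```python
import math


def split_segment(frames):
    """Split a segment (list of frame vectors) into location and trend.

    Assumption for illustration: location is the per-dimension mean of
    the segment, and trend is what remains after subtracting that mean.
    """
    n = len(frames)
    dim = len(frames[0])
    location = [sum(f[d] for f in frames) / n for d in range(dim)]
    trend = [[f[d] - location[d] for d in range(dim)] for f in frames]
    return location, trend


def softmax(scores):
    """Convert raw network scores into posterior probabilities."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tandem_features(scores, floor=1e-10):
    """Turn network scores into features for a downstream HMM recognizer.

    Assumption: log posteriors are used, with a small floor to avoid
    log(0). Any decorrelation step is omitted from this sketch.
    """
    return [math.log(max(p, floor)) for p in softmax(scores)]


# Toy segment of three 2-dimensional frames.
location, trend = split_segment([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Hypothetical network scores for two classes.
features = tandem_features([0.0, 0.0])
```

In a tandem setup of this kind, the trend frames would be the input to the neural network, and the resulting posterior-based features would be passed to the HMM recognizer, alone or alongside MFCC features.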

He was a visiting researcher at ETRI from May 2006 to Feb. 2007.

Editor information

Václav Matoušek Pavel Mautner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yun, YS., Lee, Y. (2007). Design of Tandem Architecture Using Segmental Trend Features. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_49

  • DOI: https://doi.org/10.1007/978-3-540-74628-7_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74627-0

  • Online ISBN: 978-3-540-74628-7

  • eBook Packages: Computer Science, Computer Science (R0)
