Application of EαNets to Feature Recognition of Articulation Manner in Knowledge-Based Automatic Speech Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3931)

Abstract

Speech recognition has become common in many application domains. Incorporating acoustic-phonetic knowledge into the design of Automatic Speech Recognition (ASR) systems has proven to be a viable way to raise ASR accuracy. Manner-of-articulation attributes such as vowel, stop, fricative, approximant, nasal, and silence are examples of such knowledge. Neural networks have already been used successfully as detectors of manner-of-articulation attributes, starting from representations of speech signal frames. In this paper, a set of six detectors for the above-mentioned attributes is designed based on the E-αNet neural network model. This model was chosen for its ability to learn the hidden activation functions, which results in better generalization properties. The experimental set-up and results are presented; they show an average 3.5% improvement over a baseline neural network implementation.
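The abstract does not spell out the E-αNet equations, so the following is only an illustrative sketch of the general idea: a bank of six per-attribute detectors, each a small network whose hidden activation is learned rather than fixed. It assumes the activation is parameterized as a weighted sum of orthonormal Hermite functions; the class name `HermiteDetector`, the 13-dimensional frame vectors, the layer sizes, and the random (untrained) weights are all hypothetical, and training is omitted.

```python
import numpy as np

# The six manner-of-articulation attributes named in the abstract.
MANNER_CLASSES = ["vowel", "stop", "fricative", "approximant", "nasal", "silence"]


def hermite_functions(x, order):
    """Evaluate the orthonormal Hermite functions h_0..h_{order-1} at x.

    Uses the standard three-term recurrence; the result has shape
    x.shape + (order,).
    """
    x = np.asarray(x, dtype=float)
    h = np.empty(x.shape + (order,))
    h[..., 0] = np.pi ** -0.25 * np.exp(-0.5 * x ** 2)
    if order > 1:
        h[..., 1] = np.sqrt(2.0) * x * h[..., 0]
    for n in range(2, order):
        h[..., n] = (np.sqrt(2.0 / n) * x * h[..., n - 1]
                     - np.sqrt((n - 1) / n) * h[..., n - 2])
    return h


class HermiteDetector:
    """Hypothetical one-attribute detector with learnable hidden activations.

    Each hidden unit applies its own activation, a linear combination of
    Hermite functions whose coefficients `c` would be trained together with
    the ordinary weights. This is an assumption, not the paper's exact model.
    """

    def __init__(self, n_inputs, n_hidden, order, rng):
        self.W = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
        self.b = np.zeros(n_hidden)
        self.c = rng.normal(scale=0.1, size=(n_hidden, order))
        self.v = rng.normal(scale=0.1, size=n_hidden)
        self.v0 = 0.0

    def forward(self, frames):
        z = frames @ self.W + self.b                    # (n_frames, n_hidden)
        basis = hermite_functions(z, self.c.shape[1])   # (n_frames, n_hidden, order)
        a = np.einsum("fhk,hk->fh", basis, self.c)      # learned activation output
        logit = a @ self.v + self.v0
        return 1.0 / (1.0 + np.exp(-logit))             # score in (0, 1) per frame


rng = np.random.default_rng(0)
# One independent detector per attribute (untrained, for shape illustration only).
bank = {name: HermiteDetector(n_inputs=13, n_hidden=8, order=5, rng=rng)
        for name in MANNER_CLASSES}

frames = rng.normal(size=(4, 13))  # four placeholder frame feature vectors
scores = np.stack([bank[name].forward(frames) for name in MANNER_CLASSES], axis=1)
print(scores.shape)  # one row per frame, one column per attribute
```

Keeping one detector per attribute, rather than a single six-way classifier, mirrors the "set of six detectors" wording of the abstract; each column of `scores` can then be thresholded or passed on independently.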

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Siniscalchi, S.M. et al. (2006). Application of EαNets to Feature Recognition of Articulation Manner in Knowledge-Based Automatic Speech Recognition. In: Apolloni, B., Marinaro, M., Nicosia, G., Tagliaferri, R. (eds) Neural Nets. WIRN 2005, NAIS 2005. Lecture Notes in Computer Science, vol 3931. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731177_21

  • DOI: https://doi.org/10.1007/11731177_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33183-4

  • Online ISBN: 978-3-540-33184-1

  • eBook Packages: Computer Science, Computer Science (R0)
