A Discriminative Segmental Speech Model and Its Application to Hungarian Number Recognition

  • Conference paper
Text, Speech and Dialogue (TSD 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1902)

Abstract

This paper presents a stochastic segmental speech recogniser that models the a posteriori probabilities directly. The main issues concerning the system are segmental phoneme classification, utterance-level aggregation and the pruning of the search space. For phoneme classification, artificial neural networks and support vector machines are applied. Phonemic segmentation and utterance-level aggregation are performed with the aid of anti-phoneme modelling. At the phoneme level, the system convincingly outperforms the HMM system trained on the same corpus, while at the word level it attains the performance of the HMM system trained without embedded training.
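
The abstract does not spell out the decoding procedure, so the following is only a minimal sketch of what a posterior-based segmental search could look like, not the authors' algorithm. It assumes a classifier that returns a posterior distribution for any candidate segment, including an extra "anti-phoneme" class for spans that are not valid phonemes, and it finds the best segmentation by dynamic programming with a maximum segment length as a simple pruning rule. The names `decode`, `toy_posterior`, `ANTI` and `max_len` are hypothetical, and treating the anti-phoneme as one additional class is a simplifying assumption.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

ANTI = "<anti>"  # hypothetical label for "this span is not a phoneme"

# Assumed interface: posterior(start, end) -> {label: probability}
Posterior = Callable[[int, int], Dict[str, float]]


def decode(n_frames: int, posterior: Posterior,
           max_len: int) -> Tuple[float, List[Tuple[int, int, str]]]:
    """Return (log score, segments) for the best segmentation of [0, n_frames).

    Each segment [s, e) is scored by the log posterior of its best phoneme
    label. Probability mass assigned to ANTI lowers that score, so spans
    that do not look like phonemes are penalised. max_len bounds the
    segment length and thereby prunes the search space.
    """
    best = [-math.inf] * (n_frames + 1)
    back: List[Optional[Tuple[int, str]]] = [None] * (n_frames + 1)
    best[0] = 0.0
    for e in range(1, n_frames + 1):
        for s in range(max(0, e - max_len), e):
            if best[s] == -math.inf:
                continue  # no valid segmentation ends at s
            probs = posterior(s, e)
            label, p = max(((k, v) for k, v in probs.items() if k != ANTI),
                           key=lambda kv: kv[1])
            if p <= 0.0:
                continue
            score = best[s] + math.log(p)
            if score > best[e]:
                best[e] = score
                back[e] = (s, label)
    # Follow the back-pointers from the last frame to recover the segments.
    segments: List[Tuple[int, int, str]] = []
    e = n_frames
    while e > 0:
        if back[e] is None:
            raise ValueError("no segmentation found within max_len")
        s, label = back[e]
        segments.append((s, e, label))
        e = s
    segments.reverse()
    return best[n_frames], segments


def toy_posterior(s: int, e: int) -> Dict[str, float]:
    """Made-up posteriors: frames 0-1 form an 'a', frames 2-3 form a 'b'."""
    if (s, e) == (0, 2):
        return {"a": 0.9, "b": 0.05, ANTI: 0.05}
    if (s, e) == (2, 4):
        return {"a": 0.05, "b": 0.9, ANTI: 0.05}
    return {"a": 0.1, "b": 0.1, ANTI: 0.8}


score, segments = decode(4, toy_posterior, max_len=3)
print(segments)
```

In a real system the toy posterior would be replaced by a trained segment classifier such as the neural networks or support vector machines mentioned in the abstract, and the per-segment score would likely need a more careful normalisation than the plain product of posteriors used here.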

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tóth, L., Kocsor, A., Kovács, K. (2000). A Discriminative Segmental Speech Model and Its Application to Hungarian Number Recognition. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2000. Lecture Notes in Computer Science, vol 1902. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45323-7_52

  • DOI: https://doi.org/10.1007/3-540-45323-7_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41042-3

  • Online ISBN: 978-3-540-45323-9

  • eBook Packages: Springer Book Archive
