Predictive models for sequence modelling, application to speech and character recognition

Adaptive Processing of Sequences and Data Structures (NN 1997)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1387)

Abstract

We have described a series of predictive models developed to capture dependencies within non-stationary sequences. Although the motivations and sources of inspiration behind these models are varied, they all pursue the same goal. Other approaches exist that we have not described here. An important class of models that uses parametric trajectories is that of Segment Models; a review and a comparison with HMMs may be found in [37]. Up to now, predictive models have not led to better results than classical multi-Gaussian HMMs, and most of the experiments reported by the different authors were performed on small or limited-complexity problems. However, some authors also report excellent performance of some predictive models on different tasks. In the second part of the paper, we have described a non-linear predictive HMM based on regressive neural networks, and presented experiments on two relatively large tasks where the model reaches state-of-the-art performance.

References

  1. Furui S., 1986, Speaker independent isolated word recognition using dynamic features of speech spectrum, IEEE T. ASSP, 34, 52–59.

  2. Poritz A.B., 1982, Linear predictive HMMs and the speech signal, ICASSP, Vol. 2, 1291–1294.

  3. Wellekens C., 1987, Explicit time correlation in hidden Markov models for speech recognition, ICASSP'87, pp 384–386.

  4. Brown P.F., 1987, The acoustic modeling problem in automatic speech recognition, PhD thesis, Carnegie Mellon University.

  5. Juang B.H., Rabiner L.R., 1985, Mixture autoregressive hidden Markov models for speech signals, IEEE T. ASSP, Vol. 33, No. 6, pp 1404–1413, Dec.

  6. Kenny P., Lennig M., Mermelstein P., 1990, A linear predictive HMM for vector-valued observations with application to speech recognition, IEEE Trans. on Acoustics Speech and Signal Processing, ASSP-38, 2, pp 220–225.

  7. Woodland P.C., 1992, Hidden Markov models using vector linear predictors and discriminative output distributions, ICASSP'92, pp 509–512.

  8. Tishby N., 1991, On the application of mixture AR HMMs to text-independent speaker recognition, IEEE Trans. on Signal Processing, Vol. 39, No. 3, March 1991.

  9. Kawabata T., 1993, Speaker-independent speech recognition using nonlinear predictor codebooks, ICASSP.

  10. Artières T., Gallinari P., 1995, Multi-state predictive neural models for text-independent speaker identification, Eurospeech 95.

  11. Mellouk A., Gallinari P., 1993, A discriminative neural prediction system for speech recognition, ICASSP 93, pp II 553–536.

  12. Deng L., Hassanein H., Elmasry M., 1994, Analysis of the correlation structure for a neural predictive model with application to speech recognition, Neural Networks, Vol. 7, No. 2, 331–339.

  13. Bianchini M., Frasconi P., Gori M., 1995, Learning in multilayered networks used as autoassociators, IEEE Transactions on Neural Networks, Vol. 6, No. 2, 512–514.

  14. Artières T., 1995, Approches prédictives neuronales: application à l'identification du locuteur, Thèse de doctorat, Université de Paris Sud (in French).

  15. Tebelskis J., Waibel A., Petek B., Schmidbauer O., 1991, Continuous speech recognition using linked predictive neural networks, ICASSP 91, pp 61–64.

  16. Iso K., Watanabe T., 1990, Speaker-independent word recognition using a neural prediction model, ICASSP.

  17. Iso K., Watanabe T., 1991, Large vocabulary speech recognition using neural prediction model, ICASSP 91, pp 57–60.

  18. Petek B., Waibel A., Tebelskis J., 1992, Integrated and phoneme-function word architecture of hidden control neural networks for continuous speech recognition, Speech Communication, Special Issue on Eurospeech, Vol. 11, No. 2, pp 273–282.

  19. Levin E., 1993, Hidden control neural architecture modeling of non-linear time-varying systems and its applications, IEEE Trans. on NN, Vol. 4.

  20. Tsuboka E., Takada Y., Wakita H., 1990, Neural predictive hidden Markov model, ICSLP.

  21. Rabiner L., Juang B.H., 1993, Fundamentals of speech recognition, Prentice Hall.

  22. Deng L., Aksmanovic M., Sun X., 1994, Speech recognition using hidden Markov models with polynomial functions as nonstationary states, IEEE Trans. SAP, 507–520.

  23. Hattori H., 1992, Text-independent speaker recognition using neural networks, ICASSP, II 153–156.

  24. Mellouk A., Gallinari P., 1994, Discriminative training for improved neural prediction system, ICASSP 94, pp 1233–1236.

  25. Mellouk A., Gallinari P., 1995, Global discrimination for neural predictive systems based on N-Best algorithm, ICASSP'95.

  26. Rao T.S., The fitting of non-stationary time series models with time-dependent parameters, J. R. S. S. Series B, Vol. 32, No. 2, 312–322.

  27. Liporace L.A., 1975, Linear estimation of non-stationary signals, J. Acoust. Soc. Amer., Vol. 58, No. 6, 1288–1295.

  28. Grenier Y., 1983, Time-dependent ARMA modeling of non-stationary signals, IEEE T. ASSP, Vol. 31, No. 4, 899–911.

  29. Gish H., Ng K., 1993, A segmental speech model with applications to word spotting, ICASSP'93, II 447–450.

  30. Deng L., 1993, A stochastic model of speech incorporating hierarchical non-stationarity, IEEE T. SAP, Vol. 1, No. 4, 471–474.

  31. Deng L., Rathinavelu C., 1995, A Markov model containing state-conditioned second order non-stationarity: application to speech recognition, Comp. Speech and Lang., 9, 63–86.

  32. Garcia-Salicetti, Dorizzi B., Gallinari P., Wimmer Z., 1996, Adaptive discrimination in an HMM based neural predictive system for on-line word recognition, ICPR-96.

  33. Robinson T., 1991, Several improvements to a recurrent error propagation network phone recognition system, Tech. Rep. CUED/F-INFENG/TR.82, Cambridge Univ. Eng. Dept, Sept.

  34. Lee K. F., Hon H-W., 1989, Speaker-independent phone recognition using hidden Markov models, IEEE Trans. ASSP, Vol. 37, No. 11, 1641–1648.

  35. Manke S., Finke M., Waibel A., 1995, NPen++: a writer independent large vocabulary on line hand-writing recognition system, ICDAR'95, 403–408.

  36. Schwartz R., Chow Y.L., 1990, The N-Best algorithm: An efficient and exact procedure for finding the N most likely hypotheses, ICASSP 90, pp 81–84.

  37. Ostendorf M., Digalakis V., Kimball O.A., 1996, From HMM's to segment models: a unified view of stochastic modelling for speech recognition, IEEE T. SAP, Vol. 4, No. 5, 360–378.

Editor information

C. Lee Giles, Marco Gori

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Gallinari, P. (1998). Predictive models for sequence modelling, application to speech and character recognition. In: Giles, C.L., Gori, M. (eds) Adaptive Processing of Sequences and Data Structures. NN 1997. Lecture Notes in Computer Science, vol 1387. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054007

  • DOI: https://doi.org/10.1007/BFb0054007

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64341-8

  • Online ISBN: 978-3-540-69752-7

  • eBook Packages: Springer Book Archive
