Predictive models for sequence modelling, application to speech and character recognition

Gallinari, P.

doi:10.1007/BFb0054007

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1387)

Included in the following conference series:

International School on Neural Networks, Initiated by IIASS and EMFCSC

Abstract

We have described a series of predictive models developed to capture dependencies within non-stationary sequences. Although the motivations and sources of inspiration for these models are varied, they all pursue the same goal. Other approaches exist that we have not described here. One important class, Segment Models, uses parametric trajectories; a review and a comparison with HMMs may be found in [37]. So far, predictive models have not outperformed classical multi-Gaussian HMMs, and most reported experiments were performed on small or limited-complexity problems. However, some authors also report excellent performance of some predictive models on different tasks. In the second part of the paper, we have described a non-linear predictive HMM based on regressive neural networks, and presented experiments on two relatively large tasks where the model reaches state-of-the-art performance.

References

[1] Furui S., 1986, Speaker independent isolated word recognition using dynamic features of speech spectrum, IEEE T. ASSP, 34, 52–59.
[2] Poritz A.B., 1982, Linear predictive HMMs and the speech signal, ICASSP, Vol. 2, 1291–1294.
[3] Wellekens C., 1987, Explicit time correlation in hidden Markov models for speech recognition, ICASSP'87, pp 384–386.
[4] Brown P.F., 1987, The acoustic modeling problem in automatic speech recognition, PhD thesis, Carnegie Mellon University.
[5] Juang B.H., Rabiner L.R., 1985, Mixture autoregressive hidden Markov models for speech signals, IEEE T. ASSP, Vol. 33, No. 6, pp 1404–1413, Dec.
[6] Kenny P., Lennig M., Mermelstein P., 1990, A linear predictive HMM for vector-valued observations with application to speech recognition, IEEE Trans. on Acoustics Speech and Signal Processing, ASSP-38, 2, pp 220–225.
[7] Woodland P.C., 1992, Hidden Markov models using vector linear predictors and discriminative output distributions, ICASSP'92, pp 509–512.
[8] Tishby N., 1991, On the application of mixture AR HMMs to text-independent speaker recognition, IEEE Trans. on Signal Processing, Vol. 39, No. 3, March 91.
[9] Kawabata T., 1993, Speaker-independent speech recognition using nonlinear predictor codebooks, ICASSP.
[10] Artières T., Gallinari P., 1995, Multi-state predictive neural models for text-independent speaker identification, Eurospeech 95.
[11] Mellouk A., Gallinari P., 1993, A discriminative neural prediction system for speech recognition, ICASSP 93, pp II 553–536.
[12] Deng L., Hassanein H., Elmasry M., 1994, Analysis of the correlation structure for a neural predictive model with application to speech recognition, Neural Networks, Vol. 7, No. 2, 331–339.
[13] Bianchini M., Frasconi P., Gori M., 1995, Learning in multilayered networks used as autoassociators, IEEE Transactions on Neural Networks, vol. 6, no. 2, 512–514.
[14] Artières T., 1995, Approches prédictives neuronales: application à l'identification du locuteur, Thèse de doctorat, Université de Paris Sud (in French).
[15] Tebelskis J., Waibel A., Petek B., Schmidbauer O., 1991, Continuous speech recognition using linked predictive neural networks, ICASSP 91, pp 61–64.
[16] Iso K., Watanabe T., 1990, Speaker-independent word recognition using a neural prediction model, ICASSP.
[17] Iso K., Watanabe T., 1991, Large vocabulary speech recognition using neural prediction model, ICASSP 91, pp 57–60.
[18] Petek B., Waibel A., Tebelskis J., 1992, Integrated and phoneme-function word architecture of hidden control neural networks for continuous speech recognition, Speech Communication, Special Issue on Eurospeech, Vol. 11, No. 2, pp 273–282.
[19] Levin E., 1993, Hidden control neural architecture modeling of non linear time varying systems and its applications, IEEE Trans. on NN, vol 4.
[20] Tsuboka E., Takada Y., Wakita H., 1990, Neural predictive hidden Markov model, ICSLP.
[21] Rabiner L., Juang B.H., 1993, Fundamentals of speech recognition, Prentice Hall.
[22] Deng L., Aksmanovic M., Sun X., 1994, Speech recognition using hidden Markov models with polynomial functions as nonstationary states, IEEE Trans. SAP, 507–520.
[23] Hattori H., 1992, Text independent speaker recognition using neural networks, ICASSP, II 153–156.
[24] Mellouk A., Gallinari P., 1994, Discriminative training for improved neural prediction system, ICASSP 94, pp 1233–1236.
[25] Mellouk A., Gallinari P., 1995, Global discrimination for neural predictive systems based on N-Best algorithm, ICASSP'95.
[26] Rao T.S., The fitting of nonstationary time series model with time dependent parameters, J. R. S. S. Series B, vol 32, No. 2, 312–322.
[27] Liporace L.A., 1975, Linear estimation of non stationary signals, J. Acoust. Soc. Amer., vol 58, No. 6, 1288–1295.
[28] Grenier Y., 1983, Time-dependent ARMA modeling of non stationary signals, IEEE T. ASSP, Vol. 31, No. 4, 899–911.
[29] Gish H., Ng K., 1993, A segmental speech model with applications to word spotting, ICASSP'93, II 447–450.
[30] Deng L., 1993, A stochastic model of speech incorporating hierarchical non-stationarity, IEEE T. SAP, Vol. 1, No. 4, 471–474.
[31] Deng L., Rathinavelu C., 1995, A Markov model containing state-conditioned second order non-stationarity: application to speech recognition, Comp. Speech and Lang., 9, 63–86.
[32] Garcia-Salicetti, Dorizzi B., Gallinari P., Wimmer Z., 1996, Adaptive discrimination in an HMM based neural predictive system for on-line word recognition, ICPR-96.
[33] Robinson T., 1991, Several improvements to a recurrent error propagation network phone recognition system, Tech. Rep. CUED/F-INFENG/TR.82, Cambridge Univ. Eng. Dept, Sept.
[34] Lee K.F., Hon H-W., 1989, Speaker-independent phone recognition using hidden Markov models, IEEE Trans. ASSP, Vol 37, No. 11, 1641–1648.
[35] Manke S., Finke M., Waibel A., 1995, NPen++: a writer independent large vocabulary on-line handwriting recognition system, ICDAR'95, 403–408.
[36] Schwartz R., Chow Y.L., 1990, The N-Best algorithm: an efficient and exact procedure for finding the N most likely hypotheses, ICASSP 90, pp 81–84.
[37] Ostendorf M., Digalakis V., Kimball O.A., 1996, From HMM's to segment models: a unified view of stochastic modelling for speech recognition, IEEE T. SAP, Vol 4, No. 5, 360–378.

Author information

Authors and Affiliations

LIP 6, Université Paris 6, 4 Place Jussieu, F-75252 Paris Cedex 5, France
P. Gallinari

Editor information

C. Lee Giles, Marco Gori

About this chapter

Cite this chapter

Gallinari, P. (1998). Predictive models for sequence modelling, application to speech and character recognition. In: Giles, C.L., Gori, M. (eds) Adaptive Processing of Sequences and Data Structures. NN 1997. Lecture Notes in Computer Science, vol 1387. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054007

DOI: https://doi.org/10.1007/BFb0054007
Published: 25 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64341-8
Online ISBN: 978-3-540-69752-7
eBook Packages: Springer Book Archive
