Abstract
Speech signals are statistically nonstationary and cannot be modelled adequately by classical linear parametric models (AR, MA, ARMA). The neural-network approach to time-series prediction is well suited to learning and capturing the nonlinear nature of the speech signal. We present a neural implementation of the NARMA (nonlinear ARMA) model and test it on a class of speech signals spoken by both men and women in different dialects of English. Akaike's information criterion is proposed for selecting the parameters of the NARMA model.
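Akaike's information criterion trades goodness of fit against model complexity: among candidate model orders, the one with the smallest criterion value is preferred. A rough illustration of this selection step is sketched below; the Gaussian-likelihood form of the criterion and the candidate residuals are stand-ins for illustration, not the paper's data or code.

```python
import numpy as np

def aic(residuals, n_params):
    """Akaike's information criterion for a fitted predictor.

    Uses the Gaussian log-likelihood form: AIC = n*ln(RSS/n) + 2*k,
    where RSS is the residual sum of squares and k the parameter count.
    """
    n = len(residuals)
    rss = float(np.sum(np.asarray(residuals) ** 2))
    return n * np.log(rss / n) + 2 * n_params

# Hypothetical example: compare candidate NARMA orders (p, q) by their
# one-step prediction residuals; the pair with the smallest AIC wins.
candidates = {
    (2, 1): np.array([0.10, -0.05, 0.08, -0.02]),
    (4, 3): np.array([0.09, -0.05, 0.08, -0.02]),
}
scores = {pq: aic(res, n_params=sum(pq)) for pq, res in candidates.items()}
best = min(scores, key=scores.get)
```

Note how the larger model (4, 3) barely reduces the residuals, so the complexity penalty 2k makes the smaller model the winner.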



References
Brockwell PJ, Davis RA (1987) Time series: theory and methods. Springer, New York
Mandic DP, Chambers JA (2001) Recurrent neural networks for prediction: learning algorithms, architectures and stability. John Wiley & Sons
Singhal S, Wu L (1989) Training multilayer perceptrons with the extended Kalman algorithm. Advances in Neural Information Processing Systems: 133–140
Catlin D (1989) Estimation, control and the discrete Kalman filter. Springer, New York
Welch G, Bishop G (2001) An introduction to the Kalman filter. http://www.cs.unc.edu/~welch/kalman/
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F (eds) Proceedings of 2nd international symposium on information theory. Akademiai Kiado, Budapest, pp 267–281
Hamilton JD (1994) Time series analysis. Princeton University Press
Young S, Evermann G (2001) The HTK book. Cambridge University Engineering Department
Acknowledgments
The author would like to thank Professor Monica Dumitrescu and Professor Ion Văduva for their advice and support.
Appendix
Proof of Proposition 1
We assume that \( \hat{w}\left( {k|k - 1} \right) \) and \( P\left( {k|k - 1} \right) \) have been calculated. From Theorem 1, it follows that
But \( E\left( {q(k)z(k)^{T} } \right) = 0 \), because in the definition of the Kalman filter we assumed that \( E\left( {q(k)w(j)^{T} } \right) = 0 \) for \( j \le k \), \( E\left( {r(k)q(j)} \right) = 0 \) for all \( j, k \), and \( E\left( {q(j)} \right) = 0 \) for all \( j \).
It follows that
Because \( t\left( k \right) = h\left( {\hat{w}\left( {k|k - 1} \right)} \right) - H\left( k \right)\hat{w}\left( {k|k - 1} \right) \) depends only on z(k−1) = [s(1),…, s(k−1), t(1),…, t(k−1)]T, we can assume that \( \hat{w}\left( {k|k - 1,\,t(k)} \right) \), the best linear minimum-variance estimator of w(k) based on [s(1),…, s(k−1), t(1),…, t(k)]T, can be approximated by \( \hat{w}\left( {k|k - 1} \right) \). Applying Theorem 2, we obtain
where we denoted \( K\left( k \right) = P\left( {k|k - 1} \right)H\left( k \right)^{T} \left[ {C + H\left( k \right)P\left( {k|k - 1} \right)H\left( k \right)^{T} } \right]^{ + } . \)
Because \( C + H\left( k \right)P\left( {k|k - 1} \right)H\left( k \right)^{T} \in \Re \), i.e., it is a scalar, the pseudoinverse in \( K(k) \) reduces to an ordinary division, and we have
Finally, from (6)–(11), the conclusion of Proposition 1 follows. □
Proof of Proposition 2
From Theorem 1, we have
As we assumed that \( E\left( {r(k)q(j)} \right) = 0 \) for all \( j, k \), it follows that
and applying Theorem 1 again, we obtain
where we denote by \( \hat{t}\left( {k + 1|k} \right) \) the best linear minimum variance estimator of t(k + 1) based on z(k) = [s(1),…, s(k), t(1),…, t(k)]T.
It follows that
because, by definition, \( t\left( {k + 1} \right) = h\left( {\hat{w}\left( {k + 1|k} \right)} \right) - H\left( {k + 1} \right)\hat{w}\left( {k + 1|k} \right) \) depends only on z(k) = [s(1),…, s(k), t(1),…, t(k)]T, and hence we can assume that \( \hat{t}\left( {k + 1|k} \right) \approx t\left( {k + 1} \right). \)
But
So
because we assumed in the filter definition that
and r ~ N(0, C). □
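The measurement update analysed above uses the gain \( K\left( k \right) = P\left( {k|k - 1} \right)H\left( k \right)^{T} \left[ {C + H\left( k \right)P\left( {k|k - 1} \right)H\left( k \right)^{T} } \right]^{ + } \); since the bracketed term is a scalar, the pseudoinverse is an ordinary division. A minimal numerical sketch of one such update follows; the function name, shapes, and the linear test map are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def ekf_update(w_pred, P_pred, s_k, h, H_row, C):
    """One extended-Kalman measurement update for the weight vector.

    w_pred : predicted weight estimate w(k|k-1), shape (n,)
    P_pred : predicted covariance P(k|k-1), shape (n, n)
    s_k    : observed speech sample s(k), scalar
    h      : nonlinear output map, h(w) -> scalar prediction
    H_row  : Jacobian H(k) of h at w_pred, shape (1, n)
    C      : measurement-noise variance, scalar
    """
    S = C + H_row @ P_pred @ H_row.T      # innovation variance (scalar)
    K = (P_pred @ H_row.T) / S            # gain K(k); division since S is scalar
    innov = s_k - h(w_pred)               # one-step prediction error
    w_new = w_pred + (K * innov).ravel()  # updated estimate w(k|k)
    P_new = P_pred - K @ H_row @ P_pred   # updated covariance P(k|k)
    return w_new, P_new
```

With a linear map h(w) = H(k)w, the step reduces to the classical Kalman update, which provides a quick sanity check of the gain formula.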
Cite this article
Cidotă, M.A. Choosing the parameters of the NARMA model implemented with the recurrent perceptron for speech prediction. Neural Comput & Applic 19, 903–910 (2010). https://doi.org/10.1007/s00521-010-0375-7