A nonlinear prediction model for Chinese speech signal based on RBF neural network

  • 1193 - Intelligent Processing of Multimedia Signals
Multimedia Tools and Applications

Abstract

A novel prediction model for Chinese speech time series is proposed. To reconstruct the phase space of the Chinese speech signal, the delay time and embedding dimension are calculated by the C–C method and the false nearest neighbor algorithm, respectively. The maximum Lyapunov exponent and correlation dimension of Chinese speech phonemes are calculated by the Wolf algorithm and the genetic programming algorithm, respectively. The numerical results show that the Chinese speech signal exhibits nonlinear characteristics. Based on the RBF neural network and on nonlinear characteristic parameters such as the delay time and embedding dimension, a nonlinear prediction model is designed. To further verify the performance of the designed model, waveform comparison and four evaluation indexes are used. Compared with the linear prediction model and the back propagation (BP) neural network nonlinear prediction model, the RBF neural network nonlinear prediction model shows significantly reduced prediction error and higher prediction accuracy.
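The pipeline outlined in the abstract (delay embedding followed by an RBF network that predicts the next sample) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy two-tone signal stands in for speech, the values of `m` and `tau` are placeholders for what the false nearest neighbor and C–C methods would supply, and the training scheme (randomly chosen centers, a fixed Gaussian width, output weights by least squares) is one common way to fit an RBF network, not necessarily the author's.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Rows are delay vectors [x[i], x[i+tau], ..., x[i+(m-1)*tau]]."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this m and tau")
    return np.stack([x[j * tau : j * tau + n] for j in range(m)], axis=1)

def rbf_design(X, centers, width):
    """Gaussian RBF activations (one column per center) plus a bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((X.shape[0], 1))])

def fit_rbf(X, y, n_centers=50, width=1.0, seed=0):
    """Pick centers from the training vectors, then solve for output weights."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]
    weights, *_ = np.linalg.lstsq(rbf_design(X, centers, width), y, rcond=None)
    return centers, width, weights

def predict_rbf(model, X):
    centers, width, weights = model
    return rbf_design(X, centers, width) @ weights

# Toy signal (assumption): two sinusoids, standing in for a speech frame.
t = np.arange(2000)
signal = np.sin(0.07 * t) + 0.5 * np.sin(0.19 * t)

# Placeholder embedding parameters (assumption, not values from the paper).
m, tau = 4, 5
vectors = delay_embed(signal, m, tau)

# One-step-ahead targets: the sample right after each delay vector's last entry.
X = vectors[:-1]
y = signal[(m - 1) * tau + 1 :]

# Fit on the first 1500 pairs, evaluate on the remainder.
split = 1500
model = fit_rbf(X[:split], y[:split])
pred = predict_rbf(model, X[split:])
rmse = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))
print("one-step test RMSE:", rmse)
```

The root-mean-square error is used here only as a generic check; the abstract does not name its four evaluation indexes, so none of them is reproduced.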



Acknowledgements

The work reported in this paper was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 11847163, in part by the Gansu education department project under Grant 2021B-27, and in part by the Qingyang science and technology planning project under Grant QY2021A-G004. The author thanks the referees for their valuable suggestions and comments.

Author information

Corresponding author

Correspondence to Xiaohong Gao.


Cite this article

Gao, X. A nonlinear prediction model for Chinese speech signal based on RBF neural network. Multimed Tools Appl 81, 5033–5049 (2022). https://doi.org/10.1007/s11042-021-11612-6

  • DOI: https://doi.org/10.1007/s11042-021-11612-6
