A nonlinear prediction model for Chinese speech signal based on RBF neural network

Gao, Xiaohong

doi:10.1007/s11042-021-11612-6

1193 - Intelligent Processing of Multimedia Signals
Published: 08 January 2022

Volume 81, pages 5033–5049, (2022)
Multimedia Tools and Applications

Abstract

A novel prediction model for Chinese speech time series is proposed. To reconstruct the phase space of the Chinese speech signal, the delay time and embedding dimension are calculated with the C–C method and the false nearest neighbor algorithm. The maximum Lyapunov exponent and correlation dimension of Chinese speech phonemes are calculated with the Wolf algorithm and a genetic programming algorithm. The numerical results show that Chinese speech signals exhibit nonlinear characteristics. Based on the radial basis function (RBF) neural network and nonlinear characteristic parameters such as the delay time and embedding dimension, a nonlinear prediction model is designed. To verify its prediction performance, waveform comparison and four evaluation indexes are used. Compared with the linear prediction model and the back propagation (BP) neural network nonlinear prediction model, the RBF neural network nonlinear prediction model significantly reduces the prediction error and achieves higher prediction accuracy.
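The pipeline outlined in the abstract (delay-coordinate reconstruction followed by RBF-based one-step prediction) can be illustrated with a minimal sketch. It is not the author's implementation, and several choices are assumptions made only for illustration: the synthetic two-tone signal stands in for a speech frame, `DIM = 3` and `TAU = 2` are fixed by hand rather than estimated with the C–C and false nearest neighbor methods, centers are taken as every 15th training vector, the kernel width is 0.5, the output weights are fitted by ridge least squares, and RMSE is used as a single hypothetical evaluation index.

```python
import math

def delay_embed(x, dim, tau):
    """Delay vectors [x[i], x[i+tau], ..., x[i+(dim-1)*tau]] for every valid i."""
    span = (dim - 1) * tau
    return [[x[i + k * tau] for k in range(dim)] for i in range(len(x) - span)]

def rbf_features(v, centers, width):
    """Gaussian activation of v against each center, plus a constant bias term."""
    acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(v, c)) / (2.0 * width ** 2))
            for c in centers]
    return acts + [1.0]

def solve(a, b):
    """Solve the square system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w

def fit_rbf(vectors, targets, centers, width, ridge=1e-3):
    """Ridge least-squares fit of the output weights; the hidden layer stays fixed."""
    phi = [rbf_features(v, centers, width) for v in vectors]
    n = len(phi[0])
    gram = [[sum(p[i] * p[j] for p in phi) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(p[i] * t for p, t in zip(phi, targets)) for i in range(n)]
    return solve(gram, rhs)

def predict(v, centers, width, weights):
    """Network output: weighted sum of the hidden-layer activations."""
    return sum(w * f for w, f in zip(weights, rbf_features(v, centers, width)))

def rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

# Synthetic stand-in for a speech frame (assumption: not real speech data).
signal = [math.sin(0.3 * n) + 0.5 * math.sin(0.11 * n) for n in range(400)]

DIM, TAU = 3, 2   # assumed values; the paper estimates these from the data
WIDTH = 0.5       # assumed Gaussian width

vectors = delay_embed(signal, DIM, TAU)
span = (DIM - 1) * TAU
inputs = vectors[:-1]        # drop the last vector: it has no next sample
targets = signal[span + 1:]  # sample one step after each vector's last coordinate

train_x, train_y = inputs[:300], targets[:300]
test_x, test_y = inputs[300:], targets[300:]
centers = train_x[::15]      # assumed center selection: every 15th training vector

weights = fit_rbf(train_x, train_y, centers, WIDTH)
train_rmse = rmse([predict(v, centers, WIDTH, weights) for v in train_x], train_y)
test_rmse = rmse([predict(v, centers, WIDTH, weights) for v in test_x], test_y)
print("train RMSE:", train_rmse)
print("test RMSE:", test_rmse)
```

Fixing the hidden layer and solving only for the output weights keeps the fit a linear problem, which is a common way to train RBF networks. Whether it matches the training procedure used in the paper is not stated in the abstract.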

Acknowledgements

The work reported in this paper was supported by the National Natural Science Foundation of China (NSFC) under Grant 11847163, and in part by the Gansu Education Department project under Grant 2021B-27 and the Qingyang Science and Technology Planning project under Grant QY2021A-G004. The author thanks the referees for their valuable suggestions and comments.

Author information

Authors and Affiliations

School of Electrical Engineering, Longdong University, Qingyang, Gansu, China
Xiaohong Gao

Corresponding author

Correspondence to Xiaohong Gao.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gao, X. A nonlinear prediction model for Chinese speech signal based on RBF neural network. Multimed Tools Appl 81, 5033–5049 (2022). https://doi.org/10.1007/s11042-021-11612-6

Received: 06 August 2020
Revised: 22 July 2021
Accepted: 22 September 2021
Published: 08 January 2022
Issue Date: February 2022
DOI: https://doi.org/10.1007/s11042-021-11612-6
