Efficient MLP constructive training algorithm using a neuron recruiting approach for isolated word recognition system

Masmoudi, Sabeur; Frikha, Mondher; Chtourou, Mohamed; Hamida, Ahmed Ben

doi:10.1007/s10772-010-9082-0

Published: 27 November 2010

Volume 14, pages 1–10, (2011)

International Journal of Speech Technology

Sabeur Masmoudi¹, Mondher Frikha¹, Mohamed Chtourou² & Ahmed Ben Hamida¹

Abstract

This paper describes an efficient constructive training algorithm for a Multilayer Perceptron (MLP) neural network dedicated to Isolated Word Recognition (IWR) systems. An incremental training procedure was employed, based on a novel scheme for recruiting hidden neurons into a single hidden layer. During the Neural Network (NN) training phase, the number of pronunciation samples drawn from the Training Data (TD) was increased sequentially. The proposed MLP constructive training algorithm yielded an optimal NN classifier structure together with an optimized TD size.
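The abstract does not state the recruitment criterion or the schedule for growing the training set, so the following is only a minimal Python sketch of how such a constructive loop could be organized. It assumes a hypothetical `fit(hidden, subset)` callable that trains a single-hidden-layer MLP with `hidden` neurons on `subset` and returns its training error. The parameters `error_goal`, `batch_step`, and `max_hidden` are illustrative names, not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

History = List[Tuple[int, int, float]]


def constructive_train(
    samples: Sequence,
    fit: Callable[[int, Sequence], float],
    error_goal: float,
    batch_step: int = 1,
    start_hidden: int = 1,
    max_hidden: int = 50,
) -> Tuple[int, int, History]:
    """Grow the training subset step by step; recruit one hidden neuron
    whenever the current network misses `error_goal` on that subset.

    Returns (hidden neurons, samples used, history of attempts).
    """
    hidden = start_hidden
    n_used = min(batch_step, len(samples))
    history: History = []
    while True:
        err = fit(hidden, samples[:n_used])
        history.append((hidden, n_used, err))
        if err > error_goal:
            if hidden >= max_hidden:
                break  # size budget exhausted, stop without reaching the goal
            hidden += 1  # recruit a new hidden neuron and retrain
        elif n_used >= len(samples):
            break  # goal reached on all available samples
        else:
            n_used = min(n_used + batch_step, len(samples))
    return hidden, n_used, history


def toy_fit(hidden: int, subset: Sequence) -> float:
    # Stand-in for real MLP training (assumption): pretend each hidden
    # neuron lets the network absorb three more samples.
    return 0.0 if len(subset) <= 3 * hidden else 1.0


hidden, n_used, history = constructive_train(
    list(range(10)), toy_fit, error_goal=0.1, batch_step=2
)
print(hidden, n_used, len(history))
```

In a real system `toy_fit` would be replaced by actual network training, and the stopping rule could also use a held-out validation error instead of the training error.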

An isolated word recognition system based on the MLP neural network was then constructed and tested on ten words extracted from the TIMIT database. Features were extracted as Mel Frequency Cepstral Coefficients (MFCC), including energy and first- and second-derivative coefficients.
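As an illustration of how first- and second-derivative coefficients are commonly appended to static features, the sketch below uses the widely used regression formula d_t = Σ n (c_{t+n} − c_{t−n}) / (2 Σ n²). The window length of 2 and the edge handling (repeating the first and last frame) are assumptions; the abstract does not give the settings used in the paper. The static frames are assumed to already hold the MFCC and energy values.

```python
from typing import List

Frames = List[List[float]]


def delta(frames: Frames, window: int = 2) -> Frames:
    """Regression-based time derivative of a feature sequence.

    Frames beyond either end are approximated by repeating the first or
    last frame (an assumption, not taken from the paper).
    """
    n_frames = len(frames)
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out: Frames = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(static: Frames, window: int = 2) -> Frames:
    """Concatenate static, first-derivative and second-derivative features."""
    d1 = delta(static, window)
    d2 = delta(d1, window)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# A one-dimensional ramp: the slope in the middle of the sequence is 1.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
stacked = add_dynamic_features(ramp)
print(stacked[2])
```

Each output frame is three times as long as the input frame, which is why a 13-dimensional static vector typically becomes a 39-dimensional one.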

A proposed Frame-by-Frame Neural Network (FFNN) classification method was explored and compared with the Conventional Neural Network (CNN) classification approach. The Principal Component Analysis (PCA) technique was also investigated in order to reduce both the TD size and the complexity of the recognition system.
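The abstract does not say how frame-level decisions are combined into a word-level decision. One plausible reading, sketched here purely as an assumption, is that the network scores every feature frame separately and the word label is the class with the largest summed score. The `frame_scores` callable is a hypothetical stand-in for a trained MLP's output layer.

```python
from typing import Callable, List, Sequence


def classify_word_by_frames(
    frames: Sequence[Sequence[float]],
    frame_scores: Callable[[Sequence[float]], List[float]],
) -> int:
    """Sum the per-frame class scores and return the best class index."""
    totals: List[float] = []
    for frame in frames:
        scores = frame_scores(frame)
        if not totals:
            totals = [0.0] * len(scores)
        for k, s in enumerate(scores):
            totals[k] += s
    if not totals:
        raise ValueError("no frames to classify")
    return max(range(len(totals)), key=totals.__getitem__)


def toy_scores(frame: Sequence[float]) -> List[float]:
    # Hypothetical two-class scorer used only to exercise the function.
    x = frame[0]
    return [1.0 - x, x]


word = [[0.9], [0.6], [0.2], [0.8]]
label = classify_word_by_frames(word, toy_scores)
print(label)
```

Summing scores lets confident frames outweigh ambiguous ones; a simple majority vote over per-frame argmax decisions would be an equally reasonable alternative under the same assumption.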

Experimental results showed that the proposed FFNN classifier outperformed its CNN counterpart, with a significant improvement in recognition rate.

Author information

Authors and Affiliations

Technologie de l’information et électronique médicale, TIEM – LETI, ENIS, Sfax University, Sfax, Tunisia
Sabeur Masmoudi, Mondher Frikha & Ahmed Ben Hamida
Intelligent Control and Optimized System unit, ICOS, ENIS, Sfax University, Sfax, Tunisia
Mohamed Chtourou

Corresponding author

Correspondence to Sabeur Masmoudi.

About this article

Cite this article

Masmoudi, S., Frikha, M., Chtourou, M. et al. Efficient MLP constructive training algorithm using a neuron recruiting approach for isolated word recognition system. Int J Speech Technol 14, 1–10 (2011). https://doi.org/10.1007/s10772-010-9082-0

Received: 22 July 2010
Accepted: 19 November 2010
Published: 27 November 2010
Issue Date: March 2011
DOI: https://doi.org/10.1007/s10772-010-9082-0
