Abstract
Driven by the development of new information technologies, research in speech communication is expanding rapidly. Several disciplines and skills interact to improve the performance of Human Machine Communication (HMC) systems. To this end, various techniques have been applied, including Hidden Markov Models (HMM) and Neural Networks (NN).
In this paper, we propose a new approach to modelling acoustic units and a new speech recognition method, aimed in particular at the recognition of Arabic words, adapted to this type of modelling based on Wavelet Networks (WN). The new recognition system is a hybrid classifier: it uses a NN as the general model, with wavelets serving as the activation functions.
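As a minimal sketch of what "a NN with wavelets as activation functions" means, the snippet below computes the output of a one-hidden-layer wavelet network for a scalar input. The Mexican hat mother wavelet, the function names, and the parameter names are assumptions chosen for illustration; the abstract does not state which wavelet family or network layout the system uses.

```python
import math


def mexican_hat(u):
    """Mexican hat mother wavelet (assumed here; not specified in the abstract)."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def wavelet_network(x, weights, translations, dilations):
    """Weighted sum of dilated and translated wavelets evaluated at x.

    Each hidden unit i contributes weights[i] * psi((x - translations[i]) / dilations[i]),
    so the wavelet plays the role a sigmoid would play in a classical NN.
    """
    return sum(
        w * mexican_hat((x - t) / d)
        for w, t, d in zip(weights, translations, dilations)
    )


# Hypothetical two-unit network evaluated at one point.
y = wavelet_network(0.0, weights=[2.0, 0.5], translations=[0.0, 1.0], dilations=[1.0, 1.0])
```

In this toy call the second unit is evaluated at u = -1, where the Mexican hat is zero, so only the first unit contributes to `y`.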
Our speech recognition approach is divided into two phases: training and recognition. The training phase relies on an audio corpus. After all training signals are converted from their original format to a specific parameterisation, each acoustic vector is modelled by a WN, so that one WN is generated for every training signal and its vectors are refined to cover all properties of that signal in a single model. The recognition phase is divided into three steps. The first extracts features from the input signal to be recognized. The second estimates the reconstructed vectors from the training WNs. The third evaluates the distance between the vector to be recognized and each reconstructed vector.
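The three recognition steps can be sketched as follows, under stated assumptions. Feature extraction (step one) is taken as already done, each trained WN is represented by a hypothetical callable that returns its reconstruction of the input vector, and the Euclidean distance is an assumed metric. The abstract specifies none of these details, so this illustrates only the decision rule.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognise(features, trained_networks):
    """Return the label whose network reconstructs `features` most closely.

    features: feature vector of the input to recognise (output of step one).
    trained_networks: dict mapping a label to a callable that returns that
        network's reconstruction of the feature vector (hypothetical interface).
    """
    best_label, best_distance = None, float("inf")
    for label, reconstruct in trained_networks.items():
        reconstructed = reconstruct(features)           # step two
        distance = euclidean(features, reconstructed)   # step three
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Toy stand-ins for trained networks: each always returns a fixed vector.
networks = {
    "word_a": lambda v: [0.0, 0.0],
    "word_b": lambda v: [1.0, 1.0],
}
label = recognise([0.9, 1.2], networks)
```

Here `label` is `"word_b"`, because its reconstruction lies closer to the input vector than that of `"word_a"`.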
The results obtained show that our WN-based system is very competitive with HMM-based systems.
Cite this article
Ejbali, R., Zaied, M. & Ben Amar, C. Wavelet network for recognition system of Arabic word. Int J Speech Technol 13, 163–174 (2010). https://doi.org/10.1007/s10772-010-9076-y
DOI: https://doi.org/10.1007/s10772-010-9076-y