Arabic Phoneme Identification Using Conventional and Concurrent Neural Networks in Non Native Speakers

Awais, Mian M.; Masud, Shahid; Ahktar, Junaid; Shamail, Shafay

doi:10.1007/978-3-540-74171-8_90

Mian M. Awais¹,
Shahid Masud¹,
Junaid Ahktar¹ &
…
Shafay Shamail¹

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4681))

Included in the following conference series:

International Conference on Intelligent Computing

1165 Accesses

Abstract

Traditional speech recognition systems have relied on power spectral densities, Mel-frequency cepstral, linear prediction coding and formant analysis. This paper introduces two novel input feature sets and their extraction methods for intelligent phoneme identification. These input sets are based on intrinsic phonetic characteristics of Arabic speech comprising of the dimensionally reduced Power Spectral Densities (DPSD) and Location, Trend, Gradient (LTG) values of the captured speech signal spectrum. These characteristics have been subsequently utilized as inputs to four different neural network based recognition classifiers. The classifiers have been tested for twenty-eight Arabic phonemes utterances from over one hundred non-native speakers. The results obtained using the proposed feature sets have been compared and it has been observed that LTG based input feature set provides an average phoneme identification accuracy of 86% as compared to 70% obtained through applying DPSD based inputs for similar classifiers. It is worthwhile to note that the methods proposed in this paper are generic and are equally applicable to other regional languages such as Persian and Urdu.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Arabic isolated word recognition system using hybrid feature extraction techniques and neural network

Article 23 November 2017

A comparative study for Arabic speech recognition system in noisy environments

Article 27 April 2021

An experimental framework for Arabic digits speech recognition in noisy environments

Article 03 February 2017

References

Kirchhoff, K.: Novel Speech Recognition Models for Arabic, Johns-Hopkins University, Technical Report (2002) http://www.clsp.jhu.edu/ws2002/groups/arabic/ available on April 28, 2005
Kohonen, T.: Physiological Interpretation of the Self Organizing Map Algorithm. IEEE Transactions on Neural Networks 6, 895–905 (1993)
Google Scholar
Kohonen, T.: New Developments and Applications of Self-Organizing Maps. In: NICROSP. Proceedings of International Workshop on Neural Networks for Identification, Control, Robotics, and Signal/Image Processing, Venice, pp. 164–172. IEEE Computer Society Press, Los Alamitos (1996)
Google Scholar
Specht, D.F.: A general regression neural network. IEEE Transactions on Neural Networks 2(6), 568–576 (1991)
Article Google Scholar
Neagoe, V.E., Ropot, A.D.: Concurrent Self-Organizing Maps for Pattern Classification. In: IEEE International Conference on Cognitive Informatics, pp. 304-312 (2002)
Google Scholar
Somervuo, P.: Self-Organizing Maps for Signal and Symbol Sequences, PhD Thesis Helsinki University of Technology, Neural Networks Research Centre (2000)
Google Scholar
Díaz, F., Ferrández, J.M., Gómez, P., Rodellar, V., Nieto, V.: Spoken-Digit Recognition using Self-organizing Maps with Perceptual Pre-processing. In: IWANN’97, International Work-Conference on Artificial and Natural Neural Networks, Spain, pp. 1203-1212 (1997)
Google Scholar
Samouelian, A.: Knowledge Based Approach to Consonant Recognition. In: ICASSP ’94. Proceedings of International Conference on Acoustic Speech and Signal Processing, pp. 77–80 (1994)
Google Scholar
Wooters, C., Stolcke, A.: Multiple-Pronunciation Lexical Modeling in A Speaker Independent Speech Understanding System. In: ICSLP ’94. Proceedings of International Conference on Spoken Language Processing, pp. 453–456 (1994)
Google Scholar
Astrid, H., Andrew, M.: Recent Advances in the Multi-stream HMM/ANN Hybrid Approach to Noise Robust ASR. Computer Speech & Language 19(1), 3–30 (2005)
Article Google Scholar
Yuk, D., Flanagan, J.: Telephone Speech Recognition using Neural Networks and Hidden Markov Models. In: ICASSP ’99, Proceedings of International Conference on Acoustic Speech and Signal Processing, pp. 157–160 (1999)
Google Scholar
Wong, E., Sridharan, S.: Comparison of Linear Prediction Cepstrum Coefficients and Mel-Frequency Cepstrum Coefficients for Language Identification. In: International Symposium on Intelligent Multimedia, Video and Speech Processing, Hong Kong, pp. 95–98 (2001)
Google Scholar
He, J.L., Liu, L., Palm, G.: On the Use of Residual Cepstrum in Speech Recognition. In: ICASSP’96. IEEE Proceedings of International Conference on Acoustic Speech and Signal Processing, vol.1, Atlanta, USA, pp. 5-8 (1996)
Google Scholar
Rudzionis, A., Rudzionis, V.: Phoneme Recognition in Fixed Context Using Regularized Discriminant Analysis. In: Proceedings of 6th European Conference on Speech Communication and Technology, Eurospeech’99, Budapest, Hungary, pp. 2745–2748 (1999)
Google Scholar
Campbell, W.M., Campbell, J.P., Reynolds, D.A., Singer, E., Torres-Carrasquillo, P.A.: Support Vector Machines for Speaker and Language Recognition. Computer Speech & Language 20(2-3), 210–229 (2006)
Article Google Scholar
Gales, M.J.F., Airey, S.S.: Product of Gaussians for Speech Recognition. Computer Speech & Language 20(1), 22–40 (2006)
Article Google Scholar
Yousif, A.E.: Phonetization of Arabic: Rules and Algorithms. Computer Speech & Language 18(4), 339–373 (2004)
Article Google Scholar
Wiki_Arabic: http://en.wikibooks.org/wiki/Arabic/ Arabic_sounds
Yousef, A.A.: High Performance Arabic Digits Recognizer Using Neural Networks. In: The 2003 International Joint Conference on Neural Networks-IJCNN2003, Portland, Oregon (2003)
Google Scholar
Yousef, A.A.: Analyzing Arabic Digit Recognizer Errors Using Spectrograms. In: ICSP ’04. Proceedings of 7th International Conference on Signal Processing, pp. 646-650 (2004)
Google Scholar
Debyeche, H.M.: A Knowledge-based Approach for Arabic Amphatic Consonant Identification Based on Speech Spectrogram Reading. In: Proceedings of the Thirtieth Southeastern Symposium on System Theory, pp. 325 – 328 (1998)
Google Scholar
Jalaleddin, A., Egidio, M.: The Status of Vowels in Jordanian and Moroccan Arabic: Insights from Production and Perception. The Journal of the Acous. Soc. of America (2004)
Google Scholar
Mohamed, A., Ratree, W.: Acoustic Characteristics of Fricatives in Arabic. The Journal of the Acous. Soc. of America, 2655-2689 (2001)
Google Scholar
Zue, V.W., Lamel, L.F.: An Expert Spectrogram Reader: A Knowledge-Based Approach to Speech Recognition. In: Proceedings of International Conference on Acoustic Speech and Signal Processing ICASSP ’86, Japan, pp. 1197-1200 (April 1986)
Google Scholar
Ingle, V.K., Proakis, J.G.: Digital Signal Processing Using MATLAB, Thomson Engineering (1999)
Google Scholar
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C: the Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge (1992)
Google Scholar
McCandless, S.: An Agorithm for Automatic Formant Extraction Using Linear Prediction Spectra. IEEE Transactions of Acoustic, Speech and Signal Processing ASSP-22(2), 135–141 (1974)
Article Google Scholar
Rabiner, L., Juang, B.: Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs, New Jersey (1993)
Google Scholar
Awais, M.M., Masud, S., Shamail, S., Akhtar, J.: A Hybrid Multi-Layered Speaker Independent Arabic Phoneme Identification System. In: Yang, Z.R., Yin, H., Everson, R.M. (eds.) IDEAL 2004. LNCS, vol. 3177, pp. 23–25. Springer, Heidelberg (2004)
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Computer Science, Lahore University of Management Sciences, Sector-U, D.H.A., Lahore 54792, Pakistan
Mian M. Awais, Shahid Masud, Junaid Ahktar & Shafay Shamail

Authors

Mian M. Awais
View author publications
You can also search for this author in PubMed Google Scholar
Shahid Masud
View author publications
You can also search for this author in PubMed Google Scholar
Junaid Ahktar
View author publications
You can also search for this author in PubMed Google Scholar
Shafay Shamail
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

De-Shuang Huang Laurent Heutte Marco Loog

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Awais, M.M., Masud, S., Ahktar, J., Shamail, S. (2007). Arabic Phoneme Identification Using Conventional and Concurrent Neural Networks in Non Native Speakers. In: Huang, DS., Heutte, L., Loog, M. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2007. Lecture Notes in Computer Science, vol 4681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74171-8_90

Download citation

DOI: https://doi.org/10.1007/978-3-540-74171-8_90
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74170-1
Online ISBN: 978-3-540-74171-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics