Arabic Phoneme Identification Using Conventional and Concurrent Neural Networks in Non Native Speakers

  • Conference paper
Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues (ICIC 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4681)

Abstract

Traditional speech recognition systems have relied on power spectral densities, Mel-frequency cepstral coefficients, linear prediction coding and formant analysis. This paper introduces two novel input feature sets, and their extraction methods, for intelligent phoneme identification. Both sets are based on intrinsic phonetic characteristics of Arabic speech: the dimensionally reduced Power Spectral Densities (DPSD) and the Location, Trend, Gradient (LTG) values of the captured speech signal spectrum. These features are used as inputs to four different neural-network-based classifiers. The classifiers have been tested on utterances of twenty-eight Arabic phonemes from over one hundred non-native speakers. Comparing the two feature sets, the LTG-based inputs give an average phoneme identification accuracy of 86%, against 70% for the DPSD-based inputs with similar classifiers. The proposed methods are generic and are equally applicable to other regional languages such as Persian and Urdu.
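
As a rough illustration of what a dimensionally reduced power spectral density feature could look like, the sketch below computes a windowed periodogram and averages it into a small number of equal-width frequency bands. This is not the paper's DPSD extraction method, which the abstract does not specify; the function name `band_averaged_psd`, the Hann window, the band count, and the equal-width banding are all assumptions chosen for illustration.

```python
import numpy as np


def band_averaged_psd(signal, sample_rate, n_bands=16):
    """Return a short feature vector: periodogram PSD averaged per band.

    Hypothetical helper for illustration only; the banding scheme is an
    assumption and is not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(len(signal))
    # One-sided spectrum of the windowed frame.
    spectrum = np.fft.rfft(signal * window)
    # Periodogram estimate, normalised by window energy and sample rate.
    psd = np.abs(spectrum) ** 2 / (sample_rate * np.sum(window ** 2))
    # Reduce dimensionality: split the bins into n_bands nearly equal
    # groups and keep the mean power of each group.
    bands = np.array_split(psd, n_bands)
    return np.array([band.mean() for band in bands])


# Usage: a 1 kHz tone sampled at 8 kHz yields a 16-value feature vector
# whose largest entry is the band containing 1 kHz.
t = np.arange(1024) / 8000.0
features = band_averaged_psd(np.sin(2 * np.pi * 1000 * t), 8000, n_bands=16)
print(features.shape, int(np.argmax(features)))
```

A vector like this could be fed to any classifier in place of the full spectrum; the band count trades spectral resolution against input size.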

Editor information

De-Shuang Huang Laurent Heutte Marco Loog

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Awais, M.M., Masud, S., Ahktar, J., Shamail, S. (2007). Arabic Phoneme Identification Using Conventional and Concurrent Neural Networks in Non Native Speakers. In: Huang, DS., Heutte, L., Loog, M. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2007. Lecture Notes in Computer Science, vol 4681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74171-8_90

  • DOI: https://doi.org/10.1007/978-3-540-74171-8_90

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74170-1

  • Online ISBN: 978-3-540-74171-8

  • eBook Packages: Computer Science, Computer Science (R0)
