A Hybrid Multi-layered Speaker Independent Arabic Phoneme Identification System

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3177)

Abstract

A phoneme identification system for the Arabic language has been developed. It takes a hybrid approach with two layers of phoneme identification. In the first layer, power spectral information, condensed by singular value decomposition, is used to train a separate self-organizing map for each Arabic phoneme. The second layer uses a similarity metric to compare the standard pitch contours of phonemes with the pitch contour of the input sound, and it performs the identification when the first layer assigns the same input sound to more than one phoneme. The system was developed using utterances of twenty-eight Arabic phonemes from over a hundred speakers. Identification accuracy with the first layer alone was 71%, rising to 91% with the addition of the second layer. Training the self-organizing maps on singular values rather than directly on power spectral densities reduced training and recognition times by 80% and 89%, respectively. The research concludes that power spectral densities combined with pitch information yield an acceptable and robust identification system for Arabic phonemes.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Awais, M.M., Masud, S., Shamail, S., Akhtar, J. (2004). A Hybrid Multi-layered Speaker Independent Arabic Phoneme Identification System. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_61

  • DOI: https://doi.org/10.1007/978-3-540-28651-6_61

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22881-3

  • Online ISBN: 978-3-540-28651-6

  • eBook Packages: Springer Book Archive
