A Hybrid Multi-layered Speaker Independent Arabic Phoneme Identification System

Awais, Mian M.; Masud, Shahid; Shamail, Shafay; Akhtar, J.

doi:10.1007/978-3-540-28651-6_61

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3177)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

A phoneme identification system for the Arabic language has been developed. It takes a hybrid approach with two layers of phoneme identification. In the first layer, power spectral information, condensed through singular value decomposition, is used to train a separate self-organizing map for each Arabic phoneme. The second layer, based on a similarity metric, compares the standard pitch contours of the phonemes with the pitch contour of the input sound. The second layer performs the identification when the first layer produces multiple classifications of the same input sound. The system was developed using utterances of twenty-eight Arabic phonemes from over a hundred speakers. Identification accuracy with the first layer alone was 71%, rising to 91% with the addition of the second layer. Training on singular values rather than directly on power spectral densities reduced the training and recognition times of the self-organizing maps by 80% and 89%, respectively. The research concludes that power spectral densities combined with pitch information yield an acceptable and robust identification system for Arabic phonemes.
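
The two-layer flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the framing parameters, the number of singular values kept, or the similarity metric, so the frame length, hop size, `k`, the Euclidean distance on resampled contours, and all function names below are assumptions made for illustration. The self-organizing maps themselves are omitted: the sketch takes the list of phonemes whose maps accepted the input (`first_layer_hits`) as given. The phoneme labels and pitch values in the usage lines are made up.

```python
import numpy as np


def singular_value_features(signal, frame_len=256, hop=128, k=8):
    """Condense a short-time power spectrum into its k largest singular values.

    frame_len, hop and k are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] for i in range(n_frames)]
    )
    # Power spectrum of each windowed frame (rows: frames, columns: frequency bins).
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    # Singular values are returned in descending order; keep the first k.
    return np.linalg.svd(power, compute_uv=False)[:k]


def _resample(contour, n):
    """Linearly resample a pitch contour to n points so contours are comparable."""
    contour = np.asarray(contour, dtype=float)
    old_axis = np.linspace(0.0, 1.0, len(contour))
    return np.interp(np.linspace(0.0, 1.0, n), old_axis, contour)


def pitch_distance(a, b, n=32):
    """Assumed similarity metric: Euclidean distance between resampled contours."""
    return float(np.linalg.norm(_resample(a, n) - _resample(b, n)))


def identify(first_layer_hits, input_contour, reference_contours):
    """Return one phoneme label, using pitch only when the first layer is ambiguous."""
    if not first_layer_hits:
        return None  # no map accepted the input
    if len(first_layer_hits) == 1:
        return first_layer_hits[0]  # first layer alone is decisive
    # Second layer: choose the candidate whose reference contour is closest.
    return min(
        first_layer_hits,
        key=lambda p: pitch_distance(input_contour, reference_contours[p]),
    )


# Hypothetical reference contours (Hz) for two candidate phonemes.
refs = {"ba": [120, 125, 130], "ta": [180, 170, 160]}
print(identify(["ba", "ta"], [118, 124, 127, 131], refs))  # prints "ba"
```

Keeping the second layer as a tie-breaker over the first layer's candidates mirrors the abstract's statement that pitch comparison is used only when the first layer yields multiple classifications.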

Author information

Authors and Affiliations

Department of Computer Science, Lahore University of Management Sciences, Sector-U, D.H.A., Lahore, 54792, Pakistan
Mian M. Awais, Shahid Masud, Shafay Shamail & J. Akhtar

Editor information

Editors and Affiliations

School of Engineering, Computing, and Mathematics, University of Exeter, EX4 4QF, Exeter, UK
Zheng Rong Yang
School of Electrical and Electronic Engineering, University of Manchester, UK
Hujun Yin
School of Engineering, Computer Science and Mathematics, University of Exeter, EX4 4QF, UK
Richard M. Everson

Cite this paper

Awais, M.M., Masud, S., Shamail, S., Akhtar, J. (2004). A Hybrid Multi-layered Speaker Independent Arabic Phoneme Identification System. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_61

DOI: https://doi.org/10.1007/978-3-540-28651-6_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22881-3
Online ISBN: 978-3-540-28651-6
eBook Packages: Springer Book Archive