Arabic phonemes recognition using hybrid LVQ/HMM model for continuous speech recognition

Nahar, Khalid M. O.; Abu Shquier, Mohammed; Al-Khatib, Wasfi G.; Al-Muhtaseb, Husni; Elshafei, Moustafa

doi:10.1007/s10772-016-9337-5

Published: 20 May 2016

Volume 19, pages 495–508, (2016)

International Journal of Speech Technology

Khalid M. O. Nahar¹,
Mohammed Abu Shquier²,
Wasfi G. Al-Khatib³,
Husni Al-Muhtaseb³ &
…
Moustafa Elshafei⁴

Abstract

In an attempt to increase the recognition rate of Arabic phonemes, we introduce a novel hybrid recognition algorithm that combines learning vector quantization (LVQ) with a hidden Markov model (HMM). The hybrid algorithm is used to recognize Arabic phonemes in continuous, open-vocabulary speech. A recorded corpus of Modern Standard Arabic from different TV news broadcasts was used for training and testing. We employ a data-driven approach to generate training feature vectors that embed the correlation information of neighboring frames. Next, we generate the phoneme codebooks using the K-means splitting algorithm and then train the generated codebooks using the LVQ algorithm. We achieved a performance of 98.49% during independent classification training and 90% during dependent classification training. When the trained LVQ codebooks were used for Arabic utterance transcription, the phoneme recognition rate was 72% using LVQ only. We then combined the LVQ codebooks with a single-state HMM using an enhanced Viterbi algorithm that incorporates phoneme bigrams. With the hybrid LVQ/HMM algorithm, we achieved an Arabic phoneme recognition rate of 89%.
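
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the full text is not available on this page. It assumes the textbook LVQ1 update rule, a linearly decaying learning rate, and an emission score equal to the negative squared distance to the nearest codeword of each phoneme. It also uses a plain Viterbi pass with one state per phoneme and log bigram transition scores in place of the paper's "enhanced" variant, and it omits the K-means splitting step. All names (`lvq1_train`, `frame_scores`, `viterbi_bigram`, `floor`) and the toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def lvq1_train(codebook, samples, alpha=0.1, epochs=10):
    """Textbook LVQ1. codebook and samples are lists of (label, vector)."""
    book = [(lab, list(vec)) for lab, vec in codebook]
    for epoch in range(epochs):
        rate = alpha * (1.0 - epoch / epochs)  # assumed linear decay
        for label, x in samples:
            # Find the winning (nearest) codeword.
            i = min(range(len(book)), key=lambda k: sq_dist(book[k][1], x))
            win_label, w = book[i]
            # Attract the winner if its label matches, repel it otherwise.
            sign = 1.0 if win_label == label else -1.0
            book[i] = (win_label,
                       [wj + sign * rate * (xj - wj) for wj, xj in zip(w, x)])
    return book

def frame_scores(book, x):
    """Assumed emission score: negative squared distance to the nearest
    codeword of each phoneme."""
    scores = {}
    for lab, w in book:
        s = -sq_dist(w, x)
        if lab not in scores or s > scores[lab]:
            scores[lab] = s
    return scores

def viterbi_bigram(book, frames, log_bigram, floor=-1e3):
    """Plain Viterbi with one state per phoneme. log_bigram maps
    (previous, next) phoneme pairs to log probabilities; unseen pairs
    receive the assumed penalty `floor`."""
    phones = sorted({lab for lab, _ in book})
    best = frame_scores(book, frames[0])
    back = []
    for x in frames[1:]:
        emit = frame_scores(book, x)
        new_best, ptr = {}, {}
        for q in phones:
            prev = max(phones,
                       key=lambda p: best[p] + log_bigram.get((p, q), floor))
            new_best[q] = best[prev] + log_bigram.get((prev, q), floor) + emit[q]
            ptr[q] = prev
        back.append(ptr)
        best = new_best
    # Trace the best path backwards from the best final state.
    path = [max(phones, key=lambda p: best[p])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

if __name__ == "__main__":
    # Toy two-phoneme codebook and three frames; the middle frame is ambiguous.
    book = [("a", [0.0, 0.0]), ("b", [1.0, 1.0])]
    frames = [[0.0, 0.0], [0.55, 0.55], [0.0, 0.0]]
    bigram = {("a", "a"): -0.105, ("a", "b"): -2.303,
              ("b", "b"): -0.105, ("b", "a"): -2.303}
    framewise = []
    for f in frames:
        scores = frame_scores(book, f)
        framewise.append(max(scores, key=scores.get))
    print(framewise)                             # ['a', 'b', 'a']
    print(viterbi_bigram(book, frames, bigram))  # ['a', 'a', 'a']
```

In the toy run, frame-by-frame classification flips to "b" on the ambiguous middle frame, while the bigram-weighted path stays on "a". This is the kind of smoothing that sequence decoding adds on top of LVQ alone; the sketch makes no claim about the size of the gain reported in the abstract.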

Notes

Except for the first 3 frames and the last 3 frames in the feature matrix.

Author information

Authors and Affiliations

Computer Science Department, Faculty of Computer Sciences and Information Technology, Yarmouk University, Irbid, 21163, Jordan
Khalid M. O. Nahar
Computer Science Department, Faculty of Computer Science and Information Technology, Jarash University, Jarash, Jordan
Mohammed Abu Shquier
Information & Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Kingdom of Saudi Arabia
Wasfi G. Al-Khatib & Husni Al-Muhtaseb
Systems Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Kingdom of Saudi Arabia
Moustafa Elshafei

Corresponding author

Correspondence to Khalid M. O. Nahar.

About this article

Cite this article

Nahar, K.M.O., Abu Shquier, M., Al-Khatib, W.G. et al. Arabic phonemes recognition using hybrid LVQ/HMM model for continuous speech recognition. Int J Speech Technol 19, 495–508 (2016). https://doi.org/10.1007/s10772-016-9337-5

Received: 29 October 2015
Accepted: 26 February 2016
Published: 20 May 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10772-016-9337-5
