Comparing ANN to HMM in implementing limited Arabic vocabulary ASR systems

Alotaibi, Yousef Ajami

doi:10.1007/s10772-011-9107-3

Published: 01 September 2011

Volume 15, pages 25–32 (2012)
International Journal of Speech Technology

Yousef Ajami Alotaibi¹

207 Accesses
5 Citations
Abstract

In this paper we investigated Artificial Neural Network (ANN) based Automatic Speech Recognition (ASR) using limited-vocabulary Arabic corpora. The vocabulary subsets are digits and vowels embedded in specific carrier words. In addition, Hidden Markov Model (HMM) based ASR systems were designed and compared, on the same corpora, with two ANN-based systems: a Multilayer Perceptron (MLP) and a recurrent architecture. All systems are isolated-word speech recognizers. For digits, the ANN-based system achieved 99.5% correct recognition, whereas the HMM-based system achieved 98.1%. For vowels in carrier words, the MLP and recurrent ANN systems achieved 92.13% and 98.06% correct recognition, respectively, whereas the HMM-based system achieved 91.6%.
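
To put the reported accuracies on a common footing, the short Python sketch below converts them to error rates and computes each ANN system's error reduction relative to the HMM baseline. It uses only the figures quoted in the abstract. The function names are illustrative and not from the paper, and treating error as 100% minus accuracy assumes one word per test utterance, as is the case for isolated-word recognizers.

```python
# Accuracies (percent) exactly as quoted in the abstract; nothing else is assumed.
reported = {
    "digits": {"ANN": 99.5, "HMM": 98.1},
    "vowels": {"MLP": 92.13, "recurrent ANN": 98.06, "HMM": 91.6},
}


def error_rate(accuracy_pct):
    """Error rate in percent, taken as 100 minus accuracy (isolated-word assumption)."""
    return round(100.0 - accuracy_pct, 2)


def relative_error_reduction(baseline_acc, system_acc):
    """Percent reduction in error of a system relative to a baseline system."""
    base_err = 100.0 - baseline_acc
    sys_err = 100.0 - system_acc
    return round(100.0 * (base_err - sys_err) / base_err, 1)


if __name__ == "__main__":
    for task, systems in reported.items():
        hmm_acc = systems["HMM"]
        for name, acc in systems.items():
            if name == "HMM":
                continue
            print(
                f"{task}: {name} error {error_rate(acc)}% vs HMM "
                f"{error_rate(hmm_acc)}%, relative reduction "
                f"{relative_error_reduction(hmm_acc, acc)}%"
            )
```

Read this way, the gap between the MLP and the HMM on vowels is small, while the recurrent network and the digit-task ANN each remove roughly three quarters of the HMM's errors. The abstract does not report test-set sizes, so this sketch says nothing about statistical significance.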


Author information

Authors and Affiliations

Computer Engineering Dept., College of Computer & Information Sciences, King Saud University, P.O. Box 57168, Riyadh, 11574, Saudi Arabia
Yousef Ajami Alotaibi

Corresponding author

Correspondence to Yousef Ajami Alotaibi.

About this article

Cite this article

Alotaibi, Y.A. Comparing ANN to HMM in implementing limited Arabic vocabulary ASR systems. Int J Speech Technol 15, 25–32 (2012). https://doi.org/10.1007/s10772-011-9107-3

Received: 07 June 2011
Accepted: 04 August 2011
Published: 01 September 2011
Issue Date: March 2012
DOI: https://doi.org/10.1007/s10772-011-9107-3
