Abstract
Digit recognition plays a vital role in many human-machine interaction applications. It is used in telephone-based services such as dialing systems, airline reservation systems, bank transactions, and price extraction. This research develops a new Convolutional Neural Network (CNN) based spoken digit recognition system for Arabic digits. The system treats recognition as a classification task. First, the Mel-frequency cepstral coefficients (MFCCs) of the spoken digits are computed and reduced in the convolution phase. Then, in the classification phase, the most appropriate digit label is produced for each test utterance. The proposed approach performs well compared with similar systems: it achieved 99% correct digit recognition, compared with 98% for a Recurrent Neural Network (RNN) based digit recognition system.
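The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the number of coefficients per frame, the kernel count and width, the global max pooling step, and all function names are assumptions made for illustration, and the weights are random rather than trained. The input is a stand-in for a real MFCC matrix (frames by coefficients).

```python
import math
import random

NUM_DIGITS = 10   # Arabic digits 0-9
NUM_COEFFS = 13   # assumed MFCCs per frame; the abstract does not state this


def conv1d_over_time(features, kernels):
    """Valid 1-D convolution along the time axis, followed by ReLU.

    features: list of frames, each a list of NUM_COEFFS floats.
    kernels:  list of kernels, each `width` frames of NUM_COEFFS weights.
    Returns one list of activations (one per kernel) per time step.
    """
    width = len(kernels[0])
    out = []
    for t in range(len(features) - width + 1):
        step = []
        for kernel in kernels:
            acc = 0.0
            for dt in range(width):
                for c in range(len(kernel[dt])):
                    acc += features[t + dt][c] * kernel[dt][c]
            step.append(max(0.0, acc))  # ReLU
        out.append(step)
    return out


def global_max_pool(feature_maps):
    """Reduce a variable-length sequence to one value per kernel."""
    return [max(column) for column in zip(*feature_maps)]


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, kernels, dense_weights):
    """Convolution phase, then classification phase: returns (label, probs)."""
    pooled = global_max_pool(conv1d_over_time(features, kernels))
    scores = [sum(w * x for w, x in zip(row, pooled)) for row in dense_weights]
    probs = softmax(scores)
    label = max(range(NUM_DIGITS), key=probs.__getitem__)
    return label, probs


# Hypothetical, untrained parameters and a synthetic utterance of 40 frames.
rng = random.Random(0)
num_kernels, width = 8, 3
kernels = [[[rng.gauss(0, 0.1) for _ in range(NUM_COEFFS)]
            for _ in range(width)] for _ in range(num_kernels)]
dense_weights = [[rng.gauss(0, 0.1) for _ in range(num_kernels)]
                 for _ in range(NUM_DIGITS)]
utterance = [[rng.gauss(0, 1) for _ in range(NUM_COEFFS)] for _ in range(40)]

label, probs = classify(utterance, kernels, dense_weights)
print("predicted digit:", label)
```

Because the weights are untrained, the predicted label is arbitrary; the sketch only shows how a frame-by-coefficient MFCC matrix is reduced by convolution and pooling before a final layer assigns one of ten digit labels.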
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Azim, M.A., Hussein, W., Badr, N.L. (2021). Spoken Arabic Digits Recognition System Using Convolutional Neural Network. In: Hassanien, AE., Chang, KC., Mincong, T. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2021. Advances in Intelligent Systems and Computing, vol 1339. Springer, Cham. https://doi.org/10.1007/978-3-030-69717-4_17
DOI: https://doi.org/10.1007/978-3-030-69717-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69716-7
Online ISBN: 978-3-030-69717-4
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)