Abstract
Over the last decade, speech has become an active area of research. Man-machine interaction through voice offers an efficient and effortless way to communicate with devices. In this work on language identification, we use the International Institute of Information Technology, Hyderabad (IIIT-H) Indic speech corpus, which covers seven languages with 1000 uttered sentences per language, for a total of 7000 audio samples. After a pre-processing phase, we extract pitch and Mel Frequency Cepstral Coefficient (MFCC) features and classify the resulting sequences with a Long Short-Term Memory (LSTM) network to identify the spoken language. The model reaches a highest training accuracy of 99.8% across the hyper-parameter settings discussed in Sect. 5.
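To make the feature-extraction stage concrete, the following is a minimal NumPy sketch of per-frame pitch and MFCC extraction. It is not the authors' implementation: the 16 kHz sample rate, 25 ms frames with a 10 ms hop, 26 mel filters, 13 cepstral coefficients, the autocorrelation pitch estimator, and all function names are assumptions chosen for illustration, and a synthetic 200 Hz tone stands in for a corpus utterance.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising edge
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling edge
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(frames, sr, n_fft=512, n_filters=26, n_ceps=13):
    """MFCCs: window -> power spectrum -> mel energies -> log -> DCT-II."""
    windowed = frames * np.hamming(frames.shape[1])
    power = np.abs(np.fft.rfft(windowed, n=n_fft, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_mel @ dct_basis.T

def pitch_autocorr(frames, sr, fmin=60.0, fmax=400.0):
    """Per-frame pitch from the autocorrelation peak in [fmin, fmax]."""
    lag_min, lag_max = int(sr / fmax), int(sr / fmin)
    pitches = np.zeros(len(frames))
    for i, f in enumerate(frames):
        f = f - f.mean()
        ac = np.correlate(f, f, mode="full")[len(f) - 1:]
        if ac[0] <= 0:                         # silent frame: leave pitch at 0
            continue
        lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
        pitches[i] = sr / lag
    return pitches

# Synthetic stand-in for one utterance: a 1 s, 200 Hz tone at 16 kHz (assumed).
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 200.0 * t)

frames = frame_signal(signal, frame_len=400, hop=160)   # 25 ms frames, 10 ms hop
features = np.hstack([mfcc(frames, sr), pitch_autocorr(frames, sr)[:, None]])
print(features.shape)  # (number of frames, 13 MFCCs + 1 pitch value)
```

Each row of `features` describes one frame, so an utterance becomes a variable-length sequence of fixed-size vectors, which is the input shape a sequence classifier such as an LSTM expects.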
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Paul, B., Phadikar, S., Bera, S. (2021). Indian Regional Spoken Language Identification Using Deep Learning Approach. In: Giri, D., Buyya, R., Ponnusamy, S., De, D., Adamatzky, A., Abawajy, J.H. (eds) Proceedings of the Sixth International Conference on Mathematics and Computing. Advances in Intelligent Systems and Computing, vol 1262. Springer, Singapore. https://doi.org/10.1007/978-981-15-8061-1_21
DOI: https://doi.org/10.1007/978-981-15-8061-1_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8060-4
Online ISBN: 978-981-15-8061-1
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)