Indian Regional Spoken Language Identification Using Deep Learning Approach

Paul, Bachchu; Phadikar, Santanu; Bera, Somnath

doi:10.1007/978-981-15-8061-1_21

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1262)

Abstract

Over the last decade, speech has been an active area of research. Man–machine interaction through voice now offers an efficient and effortless mechanism. In our proposed work on language identification, we use the International Institute of Information Technology, Hyderabad (IIIT-H) Indic speech corpus, taking seven languages with 1000 uttered sentences each, for a total of 7000 audio samples. A pre-processing phase is followed by pitch and Mel Frequency Cepstral Coefficients (MFCC) feature extraction, and finally a Long Short-Term Memory (LSTM) sequence classifier is used to identify the spoken language. The model obtains a highest training accuracy of 99.8% for the different hyper-parameters discussed in Sect. 5.
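The pipeline described in the abstract (framing, per-frame features, LSTM sequence classification over seven languages) can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the authors' implementation. The abstract does not name a pitch method, so autocorrelation is an assumption here. MFCC extraction is omitted for brevity, and the LSTM weights are random and untrained. All function names, the frame length, the hop size and the hidden size are hypothetical.

```python
import math
import random

LANGUAGES = 7  # the abstract reports seven languages

def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Pitch in Hz from the strongest autocorrelation peak (0.0 if none)."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, weights, biases):
    """One LSTM time step; gate rows are ordered input, forget, cell, output."""
    z = x + h  # concatenate the input vector and the previous hidden state
    hidden = len(h)
    pre = [sum(w * v for w, v in zip(row, z)) + b
           for row, b in zip(weights, biases)]
    i_gate = [sigmoid(p) for p in pre[0:hidden]]
    f_gate = [sigmoid(p) for p in pre[hidden:2 * hidden]]
    g_cand = [math.tanh(p) for p in pre[2 * hidden:3 * hidden]]
    o_gate = [sigmoid(p) for p in pre[3 * hidden:4 * hidden]]
    c_new = [f * cp + i * g for f, cp, i, g in zip(f_gate, c, i_gate, g_cand)]
    h_new = [o * math.tanh(cn) for o, cn in zip(o_gate, c_new)]
    return h_new, c_new

def classify(sequence, hidden=8, seed=0):
    """Run an untrained LSTM over the sequence; softmax over the last state."""
    rng = random.Random(seed)
    dim = len(sequence[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim + hidden)]
               for _ in range(4 * hidden)]
    biases = [0.0] * (4 * hidden)
    out_w = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
             for _ in range(LANGUAGES)]
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, weights, biases)
    logits = [sum(w * v for w, v in zip(row, h)) for row in out_w]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Synthetic 200 Hz tone standing in for a voiced utterance (not corpus data).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1600)]
frames = frame_signal(tone, frame_len=400, hop=200)
pitch_track = [estimate_pitch(f, sample_rate) for f in frames]
# Scale pitch to roughly [0, 1] so the LSTM gates are not saturated.
features = [[p / 400.0] for p in pitch_track]
probs = classify(features)  # one probability per language, untrained
```

In a real system each frame's feature vector would also carry the MFCCs (typically supplied by an audio library), and the weights would be learned from the 7000 labelled utterances rather than drawn at random.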

Author information

Authors and Affiliations

Department of Computer Science, Vidyasagar University, Midnapore, 721102, West Bengal, India
Bachchu Paul & Somnath Bera
Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, 700064, India
Santanu Phadikar

Corresponding author

Correspondence to Bachchu Paul.

Editor information

Editors and Affiliations

Department of Information Technology, Maulana Abul Kalam Azad University of Technology, Haringhata, West Bengal, India
Debasis Giri
School of Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia
Rajkumar Buyya
Department of Mathematics, Indian Institute of Technology Madras, Chennai, India
S. Ponnusamy
Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Haringhata, West Bengal, India
Debashis De
Unconventional Computing Laboratory, Department of Computer Science and Creative Technologies, University of the West of England, Bristol, UK
Andrew Adamatzky
Faculty of Science, Engineering and Built Environment, Deakin University Geelong, Geelong, VIC, Australia
Jemal H. Abawajy

About this paper

Cite this paper

Paul, B., Phadikar, S., Bera, S. (2021). Indian Regional Spoken Language Identification Using Deep Learning Approach. In: Giri, D., Buyya, R., Ponnusamy, S., De, D., Adamatzky, A., Abawajy, J.H. (eds) Proceedings of the Sixth International Conference on Mathematics and Computing. Advances in Intelligent Systems and Computing, vol 1262. Springer, Singapore. https://doi.org/10.1007/978-981-15-8061-1_21

DOI: https://doi.org/10.1007/978-981-15-8061-1_21
Published: 11 December 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8060-4
Online ISBN: 978-981-15-8061-1
eBook Packages: Intelligent Technologies and Robotics (R0)
