Multilingual Speech Identification Framework (MSIF): A Novel Approach in Language Identification

Sawalkar, Swapnil; Roy, Pinki

doi:10.1007/978-3-031-45170-6_75

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14301)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

Multilingual language detection is the process of automatically identifying the language or languages present in a speech corpus that may contain multiple languages. Several approaches have been proposed for this task, including statistical methods, machine learning algorithms, and deep learning models. These models have difficulty determining the specific language, especially when they are biased towards certain accents, dialects, or languages, which reduces their accuracy. Hence a novel framework named “Multilingual Speech Identification Framework (MSIF)” is developed to address this problem through data augmentation and thereby increase the accuracy of language identification. Only a limited number of datasets are available for languages other than English, which makes it difficult to train models for Indian regional languages. The proposed framework therefore uses a novel Superintendence Neuvised Network, which combines a GAN and a CNN for data augmentation with transfer learning for feature extraction. Existing multilingual models can identify languages but cannot detect dialect variations, because they do not use an attention mechanism. For this reason, the proposed model uses a novel Duel Atenuative memory network, which integrates a generalized self-attention mechanism with a bi-LSTM to capture dialect variations, thereby providing better language detection for Indian regional languages.
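The abstract does not define the generalized self-attention mechanism or say how it is connected to the bi-LSTM, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes plain scaled dot-product self-attention with identity projections, applied to a short sequence of frame-level feature vectors that stand in for bi-LSTM outputs (the bi-LSTM itself is omitted). The function names and toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with identity projections.

    frames: list of T feature vectors (each a list of d floats). Here they
    stand in for per-frame bi-LSTM outputs (an assumption, not the paper's
    pipeline). Returns (contexts, weights), where weights[i][j] is how much
    frame i attends to frame j.
    """
    d = len(frames[0])
    scale = math.sqrt(d)
    contexts, weights = [], []
    for query in frames:
        # Similarity of this frame to every frame in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in frames]
        row = softmax(scores)
        weights.append(row)
        # Context vector: attention-weighted sum of all frames.
        contexts.append([sum(w * frame[j] for w, frame in zip(row, frames))
                         for j in range(d)])
    return contexts, weights


def mean_pool(vectors):
    """Average attended frame vectors into one utterance-level vector."""
    count = len(vectors)
    return [sum(v[j] for v in vectors) / count
            for j in range(len(vectors[0]))]


# Toy input: 3 frames of 2-dimensional features (hypothetical values).
toy_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts, weights = self_attention(toy_frames)
utterance_vector = mean_pool(contexts)
```

In a full model, the pooled utterance vector would typically feed a classifier over the candidate languages. Learned query, key, and value projections would replace the identity projections used here.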

Author information

Authors and Affiliations

Sipna College of Engineering and Technology Amravati, Amravati, M.S., India
Swapnil Sawalkar
National Institute of Technology Silchar, Silchar, Assam, India
Pinki Roy

Corresponding author

Correspondence to Swapnil Sawalkar.

Editor information

Editors and Affiliations

Indian Statistical Institute, Kolkata, India
Pradipta Maji
Texas A&M University at Qatar, Doha, Qatar
Tingwen Huang
Indian Statistical Institute, Kolkata, West Bengal, India
Nikhil R. Pal
Indian Institute of Technology Jodhpur, Jodhpur, India
Santanu Chaudhury
Indian Statistical Institute, Kolkata, West Bengal, India
Rajat K. De

Cite this paper

Sawalkar, S., Roy, P. (2023). Multilingual Speech Identification Framework (MSIF): A Novel Approach in Language Identification. In: Maji, P., Huang, T., Pal, N.R., Chaudhury, S., De, R.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2023. Lecture Notes in Computer Science, vol 14301. Springer, Cham. https://doi.org/10.1007/978-3-031-45170-6_75

DOI: https://doi.org/10.1007/978-3-031-45170-6_75
Published: 04 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45169-0
Online ISBN: 978-3-031-45170-6
eBook Packages: Computer Science, Computer Science (R0)