Research Article

CodeFed: Federated Speech Recognition for Low-Resource Code-Switching Detection

Published: 15 January 2024

Abstract

A common constraint in the practical application of speech recognition is code-switching. The problem is especially acute in the context of Indian languages, since most massively multilingual models are trained on corpora that are not representative of India's diverse set of languages. An associated constraint is the privacy-intrusive nature of applications that aim to collect such representative data. To mitigate both problems, this work presents CodeFed: a federated learning-based code-switching detection model that can be trained collaboratively on private data from multiple users without compromising their privacy. Using a representative low-resource Indic dataset, we demonstrate the superior performance of a global model trained collaboratively with federated learning on three low-resource Indic languages (Gujarati, Tamil, and Telugu) and compare it against the most recent work in the field. Finally, to evaluate the practical realizability of the proposed system, we also outline the label generation architecture that would accompany a real-time deployment of CodeFed.
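The training setup the abstract describes can be summarized as a federated round: each client trains the detection model locally on its private speech data, and only the resulting weights are sent back for aggregation. The sketch below illustrates one such round in PyTorch using federated averaging (McMahan et al.), the standard aggregation rule; the abstract does not specify CodeFed's exact aggregation scheme, so treat this as an illustrative baseline. The CodeSwitchDetector architecture, feature dimensions, client data, and hyperparameters are all hypothetical stand-ins, not the authors' implementation.

```python
# Illustrative federated-averaging round for a code-switching detector.
# Everything here (model, data, hyperparameters) is a hypothetical stand-in,
# not CodeFed's actual implementation.
import copy

import torch
import torch.nn as nn


class CodeSwitchDetector(nn.Module):
    """Toy utterance-level classifier: acoustic features -> switch / no-switch."""

    def __init__(self, n_features: int = 40, n_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(), nn.Linear(128, n_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def local_update(global_model, features, labels, epochs=1, lr=0.01):
    """Train a copy of the global model on one client's private data."""
    model = copy.deepcopy(global_model)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()
    return model.state_dict()  # only weights leave the client


def fed_avg(client_states):
    """Parameter-wise mean of client weights (equal weighting for simplicity)."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        for state in client_states[1:]:
            avg[key] += state[key]
        avg[key] /= len(client_states)
    return avg


# One communication round over three simulated clients (stand-ins for, e.g.,
# Gujarati, Tamil, and Telugu speakers); raw audio never leaves a client.
global_model = CodeSwitchDetector()
clients = [(torch.randn(32, 40), torch.randint(0, 2, (32,))) for _ in range(3)]
states = [local_update(global_model, x, y) for x, y in clients]
global_model.load_state_dict(fed_avg(states))
```

Because only model weights are communicated, the raw code-switched audio never needs to be centrally collected, which is the privacy property the abstract emphasizes.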

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 1 (January 2024), 385 pages
EISSN: 2375-4702
DOI: 10.1145/3613498

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 15 January 2024
Online AM: 17 November 2022
Accepted: 06 November 2022
Revised: 15 September 2022
Received: 27 June 2022
Published in TALLIP Volume 23, Issue 1

Author Tags

1. Code-switching
2. low-resource Indian languages
3. speech processing
4. federated learning
5. mobile computing
