Research Article

CodeFed: Federated Speech Recognition for Low-Resource Code-Switching Detection

Published: 15 January 2024

Abstract

A common constraint in the practical application of speech recognition is code-switching. The problem is especially acute in the context of Indian languages, since most massively multilingual models are trained on corpora that are not representative of India's diverse set of languages. An associated constraint is the privacy-intrusive nature of applications that aim to collect such representative data. To mitigate both problems, this work presents CodeFed: a federated learning-based code-switching detection model that can be trained collaboratively on private data from multiple users without compromising their privacy. Using a representative low-resource Indic dataset, we demonstrate the superior performance of a global model trained collaboratively with federated learning on three low-resource Indic languages (Gujarati, Tamil, and Telugu) and compare it against the most recent work in the field. Finally, to evaluate the practical realizability of the proposed system, we also outline the label generation architecture that would accompany a real-time deployment of CodeFed.
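The training setup the abstract describes can be summarized as a federated round: each client trains the detection model locally on its private speech data, and only the resulting weights are sent back for aggregation. The sketch below illustrates one such round in PyTorch using federated averaging (McMahan et al.), the standard aggregation rule; the abstract does not specify CodeFed's exact aggregation scheme, so treat this as an illustrative baseline. The CodeSwitchDetector architecture, feature dimensions, client data, and hyperparameters are all hypothetical stand-ins, not the authors' implementation.

```python
# Illustrative federated-averaging round for a code-switching detector.
# Everything here (model, data, hyperparameters) is a hypothetical stand-in,
# not CodeFed's actual implementation.
import copy

import torch
import torch.nn as nn


class CodeSwitchDetector(nn.Module):
    """Toy utterance-level classifier: acoustic features -> switch / no-switch."""

    def __init__(self, n_features: int = 40, n_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(), nn.Linear(128, n_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def local_update(global_model, features, labels, epochs=1, lr=0.01):
    """Train a copy of the global model on one client's private data."""
    model = copy.deepcopy(global_model)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()
    return model.state_dict()  # only weights leave the client


def fed_avg(client_states):
    """Parameter-wise mean of client weights (equal weighting for simplicity)."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        for state in client_states[1:]:
            avg[key] += state[key]
        avg[key] /= len(client_states)
    return avg


# One communication round over three simulated clients (stand-ins for, e.g.,
# Gujarati, Tamil, and Telugu speakers); raw audio never leaves a client.
global_model = CodeSwitchDetector()
clients = [(torch.randn(32, 40), torch.randint(0, 2, (32,))) for _ in range(3)]
states = [local_update(global_model, x, y) for x, y in clients]
global_model.load_state_dict(fed_avg(states))
```

Because only model weights are communicated, the raw code-switched audio never needs to be centrally collected, which is the privacy property the abstract emphasizes.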

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 1 (January 2024), 385 pages
EISSN: 2375-4702
DOI: 10.1145/3613498

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 15 January 2024
Online AM: 17 November 2022
Accepted: 06 November 2022
Revised: 15 September 2022
Received: 27 June 2022
Published in TALLIP Volume 23, Issue 1

Author Tags

1. Code-switching
2. low-resource Indian languages
3. speech processing
4. federated learning
5. mobile computing
