Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English

Chakraborty, Joyshree; Sinha, Rohit; Sarmah, Priyankoo

doi:10.1007/978-3-031-20980-2_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13721)

Included in the following conference series:

International Conference on Speech and Computer


Abstract

The goal of this work is to show the influence of accented speech on state-of-the-art speech-to-text (S2T) systems. In the current study, Assamese-accented Hindi-English (AAHE) code-switched speech samples and native Hindi-English (NHE) code-switched speech samples are submitted to four commercial S2T systems. The word error rate, averaged across the four systems, is 27.33% for the NHE group and 38.35% for the AAHE group. This performance gap is mainly attributed to substitution errors. Further analysis shows that those errors result from the distinct phonetic and phonological properties of the Assamese language. Thus, there is scope for accent adaptation even in current S2T systems supporting Indian languages.
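The abstract reports word error rate (WER) and attributes the gap between groups mainly to substitution errors. As a rough illustration of how such a breakdown can be obtained, the sketch below aligns a reference transcript with an S2T hypothesis using word-level edit distance and counts substitutions, deletions, and insertions. It is not the authors' scoring pipeline: the function name `align_counts`, the `ErrorCounts` type, and the example sentences are all hypothetical, and whitespace tokenization is a simplifying assumption.

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    """Word-level error counts for one reference/hypothesis pair (hypothetical type)."""
    substitutions: int
    deletions: int
    insertions: int
    ref_len: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, where N is the number of reference words.
        return (self.substitutions + self.deletions + self.insertions) / self.ref_len


def align_counts(reference: str, hypothesis: str) -> ErrorCounts:
    """Count S/D/I errors along one minimum-edit-distance alignment.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] holds (cost, S, D, I) for aligning ref[:i] with hyp[:j].
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no added cost
                continue
            c, s, d, n = dp[i - 1][j - 1]
            substitution = (c + 1, s + 1, d, n)
            c, s, d, n = dp[i - 1][j]
            deletion = (c + 1, s, d + 1, n)
            c, s, d, n = dp[i][j - 1]
            insertion = (c + 1, s, d, n + 1)
            # Ties are broken in the order substitution, deletion, insertion.
            dp[i][j] = min(substitution, deletion, insertion, key=lambda t: t[0])

    _, s, d, n = dp[len(ref)][len(hyp)]
    return ErrorCounts(s, d, n, len(ref))


# Invented code-switched example, not taken from the study's data.
counts = align_counts(
    "mujhe kal meeting attend karni hai",
    "mujhe kal meeting attain karni",
)
print(counts, round(counts.wer, 4))
```

In this invented example, one word is substituted and one is deleted out of six reference words. Summing such counts per speaker group is one way to see which error type drives a gap like the 11.02 percentage points between the reported averages (38.35% versus 27.33%).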



Author information

Authors and Affiliations

Centre for Linguistic Science and Technology, Indian Institute of Technology Guwahati, Guwahati, 781039, India
Joyshree Chakraborty, Rohit Sinha & Priyankoo Sarmah


Corresponding author

Correspondence to Joyshree Chakraborty.

Editor information

Editors and Affiliations

Indian Institute of Technology Dharwad, Dharwad, India
S. R. Mahadeva Prasanna
St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Koneru Lakshmaiah Education Foundation, Vaddeswaram, India
K. Samudravijaya
KIIT Group of Colleges, Gurugram, India
Shyam S. Agrawal


About this paper

Cite this paper

Chakraborty, J., Sinha, R., Sarmah, P. (2022). Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English. In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds) Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science, vol 13721. Springer, Cham. https://doi.org/10.1007/978-3-031-20980-2_9


DOI: https://doi.org/10.1007/978-3-031-20980-2_9
Published: 10 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20979-6
Online ISBN: 978-3-031-20980-2
eBook Packages: Computer Science, Computer Science (R0)
