Abstract
The objective of this work is to assess the significance of duration modification for improving the speech intelligibility of patients with slurred speech caused by traumatic brain injury (TBI). A slow speaking rate was observed in the utterances of a patient with diffuse axonal injury, a type of TBI. To compensate for the slow speaking rate, the utterances are subjected to duration modification with various scaling factors. Subjective listening tests are then conducted with a group of medical and non-medical listeners to assess the effort required to understand the spoken utterances. The improved mean opinion scores (MOS) confirm that duration modification does reduce the listening effort required to perceive the slurred speech utterances. In the listening tests, a speaker-dependent duration modification factor of 0.75 provided the best enhancement of the slurred speech, with improved intelligibility.
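As a rough illustration of what duration modification by a scaling factor involves, the sketch below shortens a signal to about 0.75 times its original length using plain windowed overlap-add. The abstract does not state which algorithm the authors used, so this is a generic stand-in and not their method; the function name `ola_time_scale`, the frame and hop sizes, and the reading of the factor as output duration divided by input duration are all assumptions made for the example.

```python
import numpy as np


def ola_time_scale(signal, factor, frame_len=1024, synth_hop=256):
    """Hypothetical helper: change duration to about factor * len(signal).

    Plain overlap-add (OLA): frames are read from the input at a hop of
    synth_hop / factor and written to the output at a hop of synth_hop.
    A factor below 1 shortens the utterance (faster speech).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    ana_hop = synth_hop / factor          # analysis hop in the input signal
    out_len = int(round(len(signal) * factor))
    n_frames = max(1, int(np.ceil(out_len / synth_hop)))

    # Zero-pad so every analysis frame has a full frame_len samples.
    padded = np.concatenate([signal, np.zeros(frame_len)])
    out = np.zeros(n_frames * synth_hop + frame_len)
    norm = np.zeros_like(out)

    for i in range(n_frames):
        a = int(round(i * ana_hop))       # read position in the input
        if a >= len(signal):
            break
        s = i * synth_hop                 # write position in the output
        out[s:s + frame_len] += padded[a:a + frame_len] * window
        norm[s:s + frame_len] += window

    # Divide by the summed window to undo the overlap gain.
    norm[norm < 1e-8] = 1.0
    return (out / norm)[:out_len]


if __name__ == "__main__":
    sr = 16000                            # assumed sampling rate in Hz
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 220.0 * t)  # 1 s synthetic test tone
    faster = ola_time_scale(tone, 0.75)
    print(len(tone), len(faster))
```

Plain overlap-add ignores pitch periods and can introduce audible artifacts on real speech; pitch-synchronous or epoch-based methods are designed to avoid this, which is why the sketch should be read only as a demonstration of the scaling factor's effect on duration.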
Acknowledgements
The authors would like to convey their sincere gratitude to all the people who participated in the listening test. The paper would not have been possible without the time spent by the doctors of All India Institute of Medical Sciences (AIIMS) Mangalagiri, who have prior experience interacting with stroke and TBI patients. Further, the authors would like to thank the hospital management of Kumar center for stroke and neuro rehabilitation for helping us collect the data and for providing the ethical clearance to use the data for academic research.
The funding for this paper is from the National Language Translation Mission (NLTM) sub consortium of the project titled “Speech Technologies in Indian Languages”, MEITY, Govt. of India.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Banerjee, O. et al. (2023). Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14338. Springer, Cham. https://doi.org/10.1007/978-3-031-48309-7_47
DOI: https://doi.org/10.1007/978-3-031-48309-7_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48308-0
Online ISBN: 978-3-031-48309-7
eBook Packages: Computer Science (R0)