Abstract
This article analyzes and compares estimates of syllable intelligibility obtained from a speech therapist (expert) with those produced by an automatic speech quality assessment algorithm. The relevance of developing algorithms for automatic assessment of speech quality and syllable intelligibility is shown. The estimates are based on voice recordings of real patients after surgical treatment of oncological diseases of the oral cavity and oropharynx. For comparison with the expert opinion, quantitative estimates are obtained using dynamic time warping (DTW) for time normalization and three metrics: DTW distance, correlation coefficient, and Minkowski distance. Because the expert estimates are binary from the outset, the quantitative estimates are converted to binary form using optimization methods. Errors between the expert estimates and the converted quantitative estimates are calculated for each patient individually and overall. Of the listed metrics, the DTW distance is chosen for further use, since it yields the estimates most consistent with the expert opinion. The task of selecting a combination of metrics for further research is posed, and its limitations are indicated.
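To make the pipeline summarized in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each recording has already been reduced to a one-dimensional feature sequence (the abstract does not specify the features), uses the absolute difference as the local DTW cost, and replaces the unspecified "optimization methods" with a plain exhaustive search over candidate thresholds. It also assumes that an expert label of 1 means the syllable was recognized. All function names (`dtw`, `align`, `correlation`, `minkowski`, `best_threshold`) are hypothetical.

```python
from math import sqrt


def dtw(a, b):
    """DTW distance and warping path for two non-empty 1-D sequences.

    Assumption: local cost is |a[i] - b[j]|; steps are diagonal, up, left.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def align(a, b, path):
    """Time-normalize both sequences by reading them along the warping path."""
    return [a[i] for i, _ in path], [b[j] for _, j in path]


def correlation(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def minkowski(x, y, p=2):
    """Minkowski distance of order p between equal-length sequences."""
    return sum(abs(u - v) ** p for u, v in zip(x, y)) ** (1.0 / p)


def best_threshold(scores, expert, lower_is_better=True):
    """Pick the threshold that minimizes disagreement with binary expert labels.

    Assumption: exhaustive search over the observed scores stands in for the
    optimization method, which the abstract does not name.
    """
    best_t, best_err = None, float("inf")
    for t in sorted(set(scores)):
        if lower_is_better:
            predicted = [int(s <= t) for s in scores]
        else:
            predicted = [int(s >= t) for s in scores]
        err = sum(p != e for p, e in zip(predicted, expert)) / len(expert)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err


# Toy usage with made-up sequences (not patient data).
reference = [0, 1, 2, 1, 0]
recording = [0, 0, 1, 2, 1, 0]
distance, path = dtw(reference, recording)
x, y = align(reference, recording, path)
r = correlation(x, y)
d2 = minkowski(x, y, p=2)
```

For distance-like metrics (DTW distance, Minkowski distance) a lower score suggests a closer match, so `lower_is_better=True` applies; for the correlation coefficient the comparison would be flipped with `lower_is_better=False`. The per-patient and overall errors mentioned in the abstract would correspond to calling `best_threshold` on one patient's scores or on the pooled scores, respectively.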
Acknowledgments
The reported study was funded by RFBR, project number 20-37-90082.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Novokhrestova, D., Kostuchenko, E., Hodashinsky, I., Balatskaya, L. (2021). Experimental Analysis of Expert and Quantitative Estimates of Syllable Recordings in the Process of Speech Rehabilitation. In: Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2021. Lecture Notes in Computer Science, vol 12997. Springer, Cham. https://doi.org/10.1007/978-3-030-87802-3_44
DOI: https://doi.org/10.1007/978-3-030-87802-3_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87801-6
Online ISBN: 978-3-030-87802-3
eBook Packages: Computer Science, Computer Science (R0)