Abstract
This paper treats the assessment of speech quality during speech rehabilitation as a classification problem. A classifier based on an LSTM neural network is built to divide speech signals into two classes: before the operation and immediately after it. Speech recorded before the operation serves as the reference that rehabilitation should approach. The degree to which an evaluated signal belongs to the reference class serves as the speech quality score. Rehabilitation sessions were assessed experimentally, and the resulting scores were compared with expert assessments of phrase intelligibility.
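The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes that a trained two-class classifier (an LSTM in the paper) outputs a pair of logits per phrase, and that the membership metric is the softmax probability of the reference class averaged over a session. The abstract does not specify either detail, so the function names, the averaging step, and the sample values below are hypothetical.

```python
import math
from statistics import mean


def reference_class_probability(logits):
    """Softmax probability of class 0 ("before operation") from two logits."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[0] / sum(exps)


def session_score(per_phrase_logits):
    """Mean reference-class probability over all phrases in one session."""
    return mean(reference_class_probability(l) for l in per_phrase_logits)


# Hypothetical classifier outputs per phrase: (before-op logit, after-op logit).
# These values are invented for illustration, not taken from the paper.
early_session = [(-1.2, 0.8), (-0.5, 0.9), (0.1, 0.3)]
late_session = [(0.9, -0.4), (1.5, -1.0), (0.2, 0.1)]

print(f"early session score: {session_score(early_session):.3f}")
print(f"late session score:  {session_score(late_session):.3f}")
```

Under these assumptions, a score near 1 means the session's speech resembles the pre-operation reference, and a score near 0 means it resembles speech recorded immediately after the operation.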
References
Kaprin, A., Starinskiy, A., Petrova, G.: Malignant neoplasm in Russia in 2019 (morbidity and mortality). P. A. Hertsen Moscow Oncology Research Center - Branch of FSBI NMRRC of the Ministry of Health of Russia, Moscow (2020)
Standard GOST R 50840-95: Voice over paths of communication. Methods for assessing the quality, legibility and recognition. Publishing Standards, Moscow, 234 p. (1995)
Balatskaya, L.N., Choinzonov, E.L., Chizevskaya, S.Y., Kostyuchenko, E.U., Meshcheryakov, R.V.: Software for assessing voice quality in rehabilitation of patients after surgical treatment of cancer of oral cavity, oropharynx and upper jaw. In: Železný, M., Habernal, I., Ronzhin, A. (eds.) SPECOM 2013. LNCS, vol. 8113, pp. 294–301. Springer, Cham (2013). https://doi.org/10.1007/978-3-319-01931-4_39
Kostyuchenko, E., Meshcheryakov, R., Ignatieva, D., Pyatkov, A., Choynzonov, E., Balatskaya, L.: Correlation normalization of syllables and comparative evaluation of pronunciation quality in speech rehabilitation. In: Karpov, A., Potapova, R., Mporas, I. (eds.) SPECOM 2017. LNCS, vol. 10458, pp. 262–271. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66429-3_25
Meschryakov, R.V., et al.: Speech quality measurement automation for patients with cancer of the oral cavity and oropharynx. In: 2016 International Siberian Conference on Control and Communications (SIBCON), pp. 1–5. IEEE, May 2016
Nikolaev, A.N.: Mathematical models and a set of programs for automatic assessment of the quality of a speech signal. The dissertation for the degree of candidate of technical sciences, specialty 05.13.18 - Mathematical modeling, numerical methods and program complexes, Ekaterinburg (2002)
Kostuchenko, E., et al.: The evaluation process automation of phrase and word intelligibility using speech recognition systems. In: Salah, A., Karpov, A., Potapova, R. (eds.) SPECOM 2019. LNCS, vol. 11658, pp. 237–246. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-26061-3_25
Rippel, O., Snoek, J., Adams, R.P.: Spectral representations for convolutional neural networks. arXiv preprint arXiv:1506.03767 (2015)
Kipyatkova, I.S., Karpov, A.A.: Variants of deep artificial neural networks for speech recognition systems. Trudy SPIIRAN 49, 80–103 (2016)
Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE (2013)
Lim, C.P., Woo, S.C., Loh, A.S., Osman, R.: Speech recognition using artificial neural networks. In: Proceedings of the First International Conference on Web Information Systems Engineering, vol. 1, pp. 419–423. IEEE, June 2000
Shukla, A., Tiwari, R.: A novel approach of speaker authentication by fusion of speech and image features using Artificial Neural Networks. Int. J. Inf. Commun. Technol. 1(2), 159–170 (2008)
Kaya, H., Karpov, A.A.: Efficient and effective strategies for cross-corpus acoustic emotion recognition. Neurocomputing 275, 1028–1034 (2018)
Graves, A., Jaitly, N., Mohamed, A.R.: Hybrid speech recognition with deep bidirectional LSTM. In: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 273–278. IEEE, December 2013
Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)
Irkutsk Supercomputer Center SB RAS. http://hpc.icc.ru/en/. Accessed 15 July 2022
Acknowledgements
This research was funded by the Ministry of Science and Higher Education of the Russian Federation within the framework of scientific projects carried out by teams of research laboratories of educational institutions of higher education subordinate to the Ministry of Science and Higher Education of the Russian Federation, project number FEWM-2020-0042. The authors would like to thank the Irkutsk Supercomputer Center of SB RAS for providing access to the HPC-cluster «Akademik V.M. Matrosov» [16].
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Kostyuchenko, E., Rakhmanenko, I., Balatskaya, L. (2022). Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem. In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds) Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science, vol 13721. Springer, Cham. https://doi.org/10.1007/978-3-031-20980-2_33
DOI: https://doi.org/10.1007/978-3-031-20980-2_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20979-6
Online ISBN: 978-3-031-20980-2
eBook Packages: Computer Science, Computer Science (R0)