Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem

Kostyuchenko, Evgeny; Rakhmanenko, Ivan; Balatskaya, Lidiya

doi:10.1007/978-3-031-20980-2_33

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13721)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

The article treats the assessment of speech quality during speech rehabilitation as a classification problem. A classifier based on an LSTM neural network is built to divide speech signals into two classes: recorded before the operation and recorded immediately after it. Speech before the operation serves as the reference that rehabilitation aims to approach. The degree to which an evaluated signal belongs to the reference class serves as the speech quality score. Rehabilitation sessions were assessed experimentally, and the resulting scores were compared with expert assessments of phrasal intelligibility.
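The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes a two-class classifier has already been trained and yields one pair of raw outputs (logits) per recording; the LSTM itself is not shown. The function names, the logit layout, the use of softmax as the membership measure, the averaging over a session, and all numeric values are assumptions for illustration, not details taken from the paper.

```python
import math
from statistics import mean


def reference_membership(logits):
    """Softmax probability of the reference ("before operation") class.

    `logits` is a pair (reference_logit, post_op_logit). Both this layout
    and the use of softmax are assumptions made for this sketch.
    """
    ref, post = logits
    m = max(ref, post)  # subtract the max for numerical stability
    e_ref = math.exp(ref - m)
    e_post = math.exp(post - m)
    return e_ref / (e_ref + e_post)


def session_score(session_logits):
    """Average reference-class membership over all recordings in a session."""
    return mean(reference_membership(pair) for pair in session_logits)


# Hypothetical classifier outputs for two rehabilitation sessions.
early_session = [(-1.2, 0.8), (-0.5, 0.9), (-2.0, 1.5)]
late_session = [(0.7, -0.3), (1.1, -0.9), (0.2, 0.1)]

early = session_score(early_session)
late = session_score(late_session)
print(f"early session score: {early:.3f}")
print(f"late session score:  {late:.3f}")
```

Under these assumptions, a score near 1 means the recording resembles pre-operative speech and a score near 0 means it resembles speech immediately after the operation, so a rising score across sessions would indicate progress toward the reference.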

References

Kaprin, A., Starinskiy, A., Petrova, G.: Malignant neoplasm in Russia in 2019 (morbidity and mortality). P. A. Hertsen Moscow Oncology Research Center - Branch of FSBI NMRRC of the Ministry of Health of Russia, Moscow (2020)
Standard GOST R 50840-95: Voice over paths of communication. Methods for assessing the quality, legibility and recognition. Publishing Standards, Moscow, 234 p. (1995)
Balatskaya, L.N., Choinzonov, E.L., Chizevskaya, S.Y., Kostyuchenko, E.U., Meshcheryakov, R.V.: Software for assessing voice quality in rehabilitation of patients after surgical treatment of cancer of oral cavity, oropharynx and upper jaw. In: Železný, M., Habernal, I., Ronzhin, A. (eds.) SPECOM 2013. LNCS, vol. 8113, pp. 294–301. Springer, Cham (2013). https://doi.org/10.1007/978-3-319-01931-4_39
Kostyuchenko, E., Meshcheryakov, R., Ignatieva, D., Pyatkov, A., Choynzonov, E., Balatskaya, L.: Correlation normalization of syllables and comparative evaluation of pronunciation quality in speech rehabilitation. In: Karpov, A., Potapova, R., Mporas, I. (eds.) SPECOM 2017, pp. 262–271. LNCS, vol. 10458. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66429-3_25
Meschryakov, R.V., et al.: Speech quality measurement automation for patients with cancer of the oral cavity and oropharynx. In: 2016 International Siberian Conference on Control and Communications (SIBCON), pp. 1–5. IEEE, May 2016
Nikolaev, A.N.: Mathematical models and a set of programs for automatic assessment of the quality of a speech signal. The dissertation for the degree of candidate of technical sciences, specialty 05.13.18 - Mathematical modeling, numerical methods and program complexes, Ekaterinburg (2002)
Kostuchenko, E., et al.: The evaluation process automation of phrase and word intelligibility using speech recognition systems. In: Salah, A., Karpov, A., Potapova, R. (eds.) SPECOM 2019. LNCS, vol. 11658, pp. 237–246. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-26061-3_25
Rippel, O., Snoek, J., Adams, R.P.: Spectral representations for convolutional neural networks. arXiv preprint arXiv:1506.03767 (2015)
Kipyatkova, I.S., Karpov, A.A.: Variants of deep artificial neural networks for speech recognition systems. Trudy SPIIRAN 49, 80–103 (2016)
Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE (2013)
Lim, C.P., Woo, S.C., Loh, A.S., Osman, R.: Speech recognition using artificial neural networks. In: Proceedings of the First International Conference on Web Information Systems Engineering, vol. 1, pp. 419–423. IEEE, June 2000
Shukla, A., Tiwari, R.: A novel approach of speaker authentication by fusion of speech and image features using Artificial Neural Networks. Int. J. Inf. Commun. Technol. 1(2), 159–170 (2008)
Kaya, H., Karpov, A.A.: Efficient and effective strategies for cross-corpus acoustic emotion recognition. Neurocomputing 275, 1028–1034 (2018)
Graves, A., Jaitly, N., Mohamed, A.R.: Hybrid speech recognition with deep bidirectional LSTM. In: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 273–278. IEEE, December 2013
Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)
Irkutsk Supercomputer Center SB RAS. http://hpc.icc.ru/en/. Accessed 15 July 2022

Acknowledgements

This research was funded by the Ministry of Science and Higher Education of the Russian Federation within the framework of scientific projects carried out by teams of research laboratories of educational institutions of higher education subordinate to the Ministry of Science and Higher Education of the Russian Federation, project number FEWM-2020-0042. The authors would like to thank the Irkutsk Supercomputer Center of SB RAS for providing access to the HPC-cluster «Akademik V.M. Matrosov» [16].

Author information

Authors and Affiliations

Tomsk State University of Control Systems and Radioelectronics, Lenina Str. 40, 634050, Tomsk, Russia
Evgeny Kostyuchenko, Ivan Rakhmanenko & Lidiya Balatskaya
Tomsk Cancer Research Institute, Kooperativniy Av. 5, 634050, Tomsk, Russia
Lidiya Balatskaya

Corresponding author

Correspondence to Evgeny Kostyuchenko.

Editor information

Editors and Affiliations

Indian Institute of Technology Dharwad, Dharwad, India
S. R. Mahadeva Prasanna
St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Koneru Lakshmaiah Education Foundation, Vaddeswaram, India
K. Samudravijaya
KIIT Group of Colleges, Gurugram, India
Shyam S. Agrawal

About this paper

Cite this paper

Kostyuchenko, E., Rakhmanenko, I., Balatskaya, L. (2022). Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem. In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds) Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science, vol 13721. Springer, Cham. https://doi.org/10.1007/978-3-031-20980-2_33

DOI: https://doi.org/10.1007/978-3-031-20980-2_33
Published: 10 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20979-6
Online ISBN: 978-3-031-20980-2
eBook Packages: Computer Science, Computer Science (R0)