Abstract
Many people today experience trauma symptoms for a variety of reasons. Trauma causes problems with emotional control and anxiety. Although a psychiatric diagnosis is essential, people are often reluctant to visit hospitals. In this paper, we propose a method for screening trauma from voice audio data using convolutional neural networks. Among the six basic emotions, four emotion classes were used for screening trauma: fear, sad, happy, and neutral. Pre-processing has two stages: the first adjusts the length of the audio data to units of 2 s and augments the amount of data, and the second converts the temporal voice signal into a spectrogram image by short-time Fourier transform. The spectrogram images are used to train four convolutional neural networks. Among them, the VGG-13 model showed the highest performance (98.96%) for screening trauma. As post-processing, a decision-level fusion strategy determines the final traumatic state by confirming that the state estimated by the trained VGG-13 model is maintained continuously. The results confirm that high-accuracy voice-based trauma diagnosis is possible, depending on the setting value for continuous state observation.
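The pre-processing and post-processing steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 16 kHz sample rate used in the usage note, the FFT size, the hop length, the Hann window, and the `min_run` parameter (standing in for the "setting value for continuous state observation") are all assumptions made for illustration. Data augmentation and the CNN itself are omitted.

```python
import numpy as np


def split_into_segments(signal, sample_rate, seconds=2.0):
    """Cut a 1-D signal into non-overlapping fixed-length segments.

    Samples left over after the last full segment are dropped.
    """
    seg_len = int(sample_rate * seconds)
    n_segments = len(signal) // seg_len
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)


def stft_magnitude(segment, n_fft=512, hop=128):
    """Magnitude spectrogram of one segment via a short-time Fourier transform.

    n_fft and hop are illustrative values, not taken from the paper.
    Returns an array of shape (n_fft // 2 + 1, n_frames).
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(segment) - n_fft) // hop
    frames = np.stack(
        [segment[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def fuse_decisions(labels, min_run=3):
    """Decision-level fusion over per-segment predictions.

    labels: sequence of 0 (non-trauma) or 1 (trauma), one per 2 s segment,
            as a classifier might produce them (hypothetical encoding).
    Returns True once `min_run` consecutive segments are labeled 1.
    """
    run = 0
    for label in labels:
        run = run + 1 if label == 1 else 0
        if run >= min_run:
            return True
    return False
```

For example, a 5 s recording at an assumed 16 kHz yields two 2 s segments, and `fuse_decisions([0, 1, 1, 0, 1, 1, 1], min_run=3)` reports a traumatic state only because the last three predictions agree. A larger `min_run` trades detection latency for fewer false alarms, which matches the abstract's remark that accuracy depends on this setting.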
Acknowledgement
This work was supported by the Industrial Strategic Technology Development Program (No. 10073159) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kim, N.H., Kim, S.E., Mok, J.W., Yu, S.G., Han, N.Y., Lee, E.C. (2021). Screening Trauma Through CNN-Based Voice Emotion Classification. In: Singh, M., Kang, DK., Lee, JH., Tiwary, U.S., Singh, D., Chung, WY. (eds) Intelligent Human Computer Interaction. IHCI 2020. Lecture Notes in Computer Science, vol 12615. Springer, Cham. https://doi.org/10.1007/978-3-030-68449-5_21
DOI: https://doi.org/10.1007/978-3-030-68449-5_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68448-8
Online ISBN: 978-3-030-68449-5
eBook Packages: Computer Science, Computer Science (R0)