A deep transfer learning approach for improved post-traumatic stress disorder diagnosis

  • Regular Paper
Knowledge and Information Systems

Abstract

Post-traumatic stress disorder (PTSD) is a traumatic-stressor-related disorder that develops after exposure to a traumatic or adverse environmental event that caused serious harm or injury. The structured interview is the only widely accepted clinical practice for PTSD diagnosis, but it suffers from several limitations, including the stigma associated with the disease. In recent years, diagnosing PTSD by analyzing speech signals has been investigated as an alternative: speech signals are processed to extract frequency features, and these features are then fed into a classification model. In this paper, we developed a deep belief network (DBN) model combined with a transfer learning (TL) strategy for PTSD diagnosis. We computed three categories of speech features and used the DBN model to fuse them. The TL strategy transferred knowledge learned from a large speech recognition database, TIMIT, to PTSD detection, where patient data are difficult to collect. We evaluated the proposed methods on two PTSD speech databases, each consisting of audio recordings from 26 patients. In a comparison with other popular methods, the state-of-the-art support vector machine (SVM) classifier achieved an accuracy of only 57.68%, whereas the TL strategy boosted the accuracy of the DBN from 61.53% to 74.99%. Altogether, our method provides a pragmatic and promising tool for PTSD diagnosis. Preliminary results of this study were presented in Banerjee et al. (in: 2017 IEEE international conference on data mining (ICDM), IEEE, 2017).
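The transfer learning idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a plain one-hidden-layer feed-forward network stands in for the DBN (no restricted Boltzmann machine pretraining is performed), synthetic Gaussian data stand in for TIMIT and for the PTSD speech features, and the choice to freeze the transferred hidden layer is an assumption made here for simplicity. All names (`init_params`, `train`, `w_shared`, and so on) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, rng):
    """Small random weights for a one-hidden-layer binary classifier."""
    return {
        "W1": 0.1 * rng.normal(size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": 0.1 * rng.normal(size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    H = sigmoid(X @ params["W1"] + params["b1"])  # hidden representation
    p = sigmoid(H @ params["W2"] + params["b2"])  # P(class = 1)
    return H, p

def train(params, X, y, lr=0.5, epochs=300, freeze_hidden=False):
    """Full-batch gradient descent on mean cross-entropy; returns new params."""
    new = {k: v.copy() for k, v in params.items()}
    n = X.shape[0]
    for _ in range(epochs):
        H, p = forward(new, X)
        d_out = (p - y) / n                            # gradient at output pre-activation
        d_hid = (d_out @ new["W2"].T) * H * (1.0 - H)  # back-propagated to hidden layer
        new["W2"] -= lr * (H.T @ d_out)
        new["b2"] -= lr * d_out.sum(axis=0)
        if not freeze_hidden:
            new["W1"] -= lr * (X.T @ d_hid)
            new["b1"] -= lr * d_hid.sum(axis=0)
    return new

rng = np.random.default_rng(0)
n_features, n_hidden = 20, 8

# Synthetic stand-ins (assumption): a large "source" set plays the role of TIMIT,
# and a 26-sample "target" set plays the role of a small PTSD speech database.
w_shared = rng.normal(size=(n_features, 1))
X_src = rng.normal(size=(2000, n_features))
y_src = (X_src @ w_shared > 0).astype(float)
X_tgt = rng.normal(size=(26, n_features))
y_tgt = (X_tgt @ w_shared + 0.5 * rng.normal(size=(26, 1)) > 0).astype(float)

# Step 1: learn a hidden representation on the large source task.
source_model = train(init_params(n_features, n_hidden, rng), X_src, y_src)

# Step 2: reuse the hidden layer, attach a fresh output layer,
# and fit only that output layer on the small target set.
transfer_init = init_params(n_features, n_hidden, rng)
transfer_init["W1"] = source_model["W1"].copy()
transfer_init["b1"] = source_model["b1"].copy()
transfer_model = train(transfer_init, X_tgt, y_tgt, freeze_hidden=True)

_, p_tgt = forward(transfer_model, X_tgt)
train_acc = float(((p_tgt > 0.5) == (y_tgt > 0.5)).mean())
print(f"target training accuracy with transferred hidden layer: {train_acc:.2f}")
```

The printed figure is a training accuracy on synthetic data and says nothing about the results reported above; with only 26 target samples, a realistic evaluation would need held-out subjects or cross-validation.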

References

  1. Banerjee D, Islam K, Mei G, Xiao L, Zhang G, Xu R, Ji S, Li J (2017) A deep transfer learning approach for improved post-traumatic stress disorder diagnosis. In: 2017 IEEE international conference on data mining (ICDM), IEEE, pp 11–20

  2. Bengio Y (2009) Learning deep architectures for AI. Found Trends® Mach Learn 2(1):1–127

  3. Bijleveld H-A (2015) Post-traumatic stress disorder and stuttering: a diagnostic challenge in a case study. Proc Soc Behav Sci 193:37–43

  4. Brown SM, Webb A, Mangoubi R, Dy JG (2015) A sparse combined regression-classification formulation for learning a physiological alternative to clinical post-traumatic stress disorder scores. In: AAAI, pp 1700–1706

  5. Calvo RA, D’Mello S (2010) Affect detection: an interdisciplinary review of models, methods, and their applications. IEEE Trans Affect Comput 1(1):18–37

  6. Deng L, Li J, Huang J-T, Yao K, Yu D, Seide F, Seltzer M, Zweig G, He X, Williams J, et al (2013) Recent advances in deep learning for speech research at Microsoft. In: 2013 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 8604–8608

  7. Dieleman S, Schrauwen B (2014) End-to-end learning for music audio. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 6964–6968

  8. Edwards AL (1948) Note on the correction for continuity in testing the significance of the difference between correlated proportions. Psychometrika 13(3):185–187

  9. Farrús M, Hernando J, Ejarque P (2007) Jitter and shimmer measurements for speaker recognition. In: Eighth annual conference of the international speech communication association

  10. Foa EB, Steketee G, Rothbaum BO (1989) Behavioral/cognitive conceptualizations of post-traumatic stress disorder. Behav Ther 20(2):155–176

  11. Friedman MJ (2007) PTSD history and overview. United States Department of Veterans Affairs

  12. Galatzer-Levy IR, Ma S, Statnikov A, Yehuda R, Shalev AY (2017) Utilization of machine learning for prediction of post-traumatic stress: a re-examination of cortisol in the prediction and pathways to non-remitting PTSD. Transl Psychiatr 7(3):e1070

  13. Galatzer-Levy IR, Karstoft KI, Statnikov A, Shalev AY (2014) Quantitative forecasting of PTSD from early trauma responses: a machine learning application. J Psychiatr Res 59:68–76

  14. Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallett DS, Dahlgren NL, Zue V (1993) TIMIT acoustic-phonetic continuous speech corpus. Linguistic Data Consortium, Philadelphia

  15. Grinage BD (2003) Diagnosis and management of post-traumatic stress disorder. Am Fam Phys 68(12):2401–2408

  16. Gulzar T, Singh A, Sharma S (2014) Comparative analysis of LPCC, MFCC and BFCC for the recognition of Hindi words using artificial neural networks. Int J Comput Appl 101(12):22–27

  17. How common is PTSD (2018) https://www.ptsd.va.gov/public/ptsd-overview/basics/how-common-is-ptsd.asp. Accessed 20 June 2018

  18. Hansen JHL, Kim W, Rahurkar M, Ruzanski E, Meyerhoff J (2011) Robust emotional stressed speech detection using weighted frequency subbands. EURASIP J Adv Signal Process 2011(1):906789

  19. Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

  20. Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507

  21. Hovens JE, Van der Ploeg HM, Klaarenbeek MTA, Bramsen I, Schreuder JN, Rivero VV (1994) The assessment of posttraumatic stress disorder: with the Clinician Administered PTSD Scale: Dutch results. J Clin Psychol 50(3):325–340

  22. Kamishima T, Hamasaki M, Akaho S (2009) TrBagg: a simple transfer learning method and its application to personalization in collaborative tagging. In: Ninth IEEE international conference on data mining, 2009, ICDM’09, IEEE, pp 219–228

  23. Karstoft KI, Galatzer-Levy IR, Statnikov A, Li Z, Shalev AY (2015) Bridging a translational gap: using machine learning to improve the prediction of PTSD. BMC Psychiatr 15(1):30

  24. Kessler RC, Rose S, Koenen KC, Karam EG, Stang PE, Stein DJ, Heeringa SG, Hill ED, Liberzon I, McLaughlin KA (2014) How well can post-traumatic stress disorder be predicted from pre-trauma risk factors? An exploratory study in the WHO World Mental Health Surveys. World Psychiatr 13(3):265–274

  25. Kim J-H, Woodland PC (2001) The use of prosody in a combined system for punctuation generation and speech recognition. In: Seventh European conference on speech communication and technology

  26. Knoth B, Vergyri D, Shriberg E, Mitra V, McLaren M, Kathol A, Richey C, Graciarena M (2018) Systems for speech-based assessment of a patient’s state-of-mind. US Patent WO2016028495 A1

  27. Krothapalli SR, Koolagudi SG (2013) Characterization and recognition of emotions from speech using excitation source information. Int J Speech Technol 16(2):181–201

  28. Kumaraswamy R, Odom P, Kersting K, Leake D, Natarajan S (2015) Transfer learning via relational type matching. In: 2015 IEEE international conference on data mining (ICDM), IEEE, pp 811–816

  29. Kunze J, Kirsch L, Kurenkov I, Krug A, Johannsmeier J, Stober S (2017) Transfer learning for speech recognition on a budget. ArXiv preprint arXiv:1706.00290

  30. Li X, Tao J, Johnson MT, Soltis J, Savage A, Leong KM, Newman JD (2007) Stress and emotion classification using jitter and shimmer features. In: IEEE international conference on acoustics, speech and signal processing, 2007, ICASSP 2007, vol 4. IEEE, pp IV–1081

  31. Litman DJ, Hirschberg JB, Swerts M (2000) Predicting automatic speech recognition performance using prosodic cues. In: Proceedings of the 1st North American chapter of the association for computational linguistics conference. Association for Computational Linguistics, pp 218–225

  32. Marinić I, Supek F, Kovačić Z, Rukavina L, Jendričko T, Kozarić-Kovačić D (2007) Posttraumatic stress disorder: diagnostic data analysis by data mining methodology. Croat Med J 48(2):185–197

  33. Muda L, Begam M, Elamvazuthi I (2010) Voice recognition algorithms using mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques. ArXiv preprint arXiv:1003.4083

  34. Omurca S, Ekinci E (2015) An alternative evaluation of post traumatic stress disorder with machine learning methods. In: 2015 International symposium on innovations in intelligent systems and applications (INISTA), IEEE, pp 1–7

  35. Ooi KEB, Low LSA, Lech M, Allen N (2012) Early prediction of major depression in adolescents using glottal wave characteristics and Teager energy parameters. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 4613–4616

  36. PTSD and DSM-5 (2016) http://www.ptsd.va.gov/professional/PTSD-overview/dsm_criteria_ptsd.asp. Accessed 10 July 2016

  37. PTSD and symptoms (2018) https://www.ptsd.va.gov/public/ptsd-overview/basics/symptoms_of_ptsd.asp. Accessed 20 June 2018

  38. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359

  39. Pitman RK (1989) Post-traumatic stress disorder, hormones, and memory. Biol Psychiatr 26(3):221–223

  40. Pratt LY (1993) Discriminability-based transfer between neural networks. In: Advances in neural information processing systems, pp 204–211

  41. Ramaswamy S, Madaan V, Qadri F, Heaney CJ, North TC, Padala PR, Sattar SP, Petty F (2005) A primary care perspective of posttraumatic stress disorder for the department of veterans affairs. Prim Care Compan J Clin Psychiatr 7(4):180

  42. Rozgic V, Vazquez-Reina A, Crystal M, Srivastava A, Tan V, Berka C (2014) Multi-modal prediction of PTSD and stress indicators. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 3636–3640

  43. Scherer S, Lucas GM, Gratch J, Rizzo AS, Morency L-P (2016) Self-reported symptoms of depression and PTSD are associated with reduced vowel space in screening interviews. IEEE Trans Affect Comput 7(1):59–73

  44. Scherer S, Stratou G, Gratch J, Morency L-P (2013) Investigating voice quality as a speaker-independent indicator of depression and PTSD. In: Interspeech, pp 847–851

  45. Razavian AS, Azizpour H, Sullivan J, Carlsson S (2014) CNN features off-the-shelf: an astounding baseline for recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 806–813

  46. Sparr LF, Bremner JD (2005) Post-traumatic stress disorder and memory: prescient medicolegal testimony at the international war crimes tribunal? J Am Acad Psychiatr Law Online 33(1):71–78

  47. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

  48. van den Broek EL, van der Sluis F, Dijkstra T (2010) Telling the story and re-living the past: how speech analysis can reveal emotions in post-traumatic stress disorder (PTSD) patients. In: Sensing emotions, Springer, pp 153–180

  49. Vergyri D, Knoth B, Shriberg E, Mitra V, McLaren M, Ferrer L, Garcia P, Marmar C (2015) Speech-based assessment of PTSD in a military population using diverse feature classes. In: Sixteenth annual conference of the international speech communication association

  50. Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83

  51. Young A (1997) The harmony of illusions: inventing post-traumatic stress disorder. Princeton University Press, Princeton

  52. Zhang Q, Wu Q, Zhu H, He L, Huang H, Zhang J, Zhang W (2016) Multimodal MRI-based classification of trauma survivors with and without post-traumatic stress disorder. Front Neurosci 10:292

  53. Zhang W, Li R, Zeng T, Sun Q, Kumar S, Ye J, Ji S (2016) Deep model based transfer and multi-task learning for biological image analysis. In: IEEE transactions on big data

  54. Zhuang X, Rozgić V, Crystal M, Marx BP (2014) Improving speech-based PTSD detection via multi-view learning. In: Spoken language technology workshop (SLT), 2014 IEEE, pp 260–265

Acknowledgements

This research is partially supported by DOD under grant W81XWH-15-C-0099. The authors would like to thank UHCMC for providing the Ohio dataset. The support of NVIDIA Corporation for the donation of the TESLA K40 GPU used in this research is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Kazi Islam.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

1.1 YouTube dataset

The following are the video links for the 26 subjects in the “Youtube Data” set used in our study (the last 8 links were not accessible as of November 17, 2018):

  1. http://www.youtube.com/user/jay10221979

  2. http://video.pbs.org/video/1506939466/

  3. http://www.youtube.com/user/veteransPTSD#p/a/76FE97306FBC00C3/1/vPPiFrwCrSI

  4. http://www.youtube.com/watch?v=bsFg8wZuI-4

  5. http://www.youtube.com/watch?v=1BoKtZ_z-sg&feature=fvw

  6. http://www.youtube.com/watch?v=u-W6_0gCVmE

  7. https://www.youtube.com/watch?v=US9J2o16bJE

  8. http://www.youtube.com/watch?v=SwiONICQe1w

  9. https://www.youtube.com/watch?v=KrMY1eqEVPs

  10. https://www.youtube.com/watch?v=TKsD-a3XKlY

  11. http://www.youtube.com/user/SpecialOpsM4#p/u/17/hlQlUgGy_gs

  12. http://www.youtube.com/user/millerusaf#p/search/6/27dEAmlyDVw

  13. http://www.youtube.com/user/xXjOmAmMaXx#p/u/10/CxwbQq8B0xw

  14. http://www.youtube.com/user/deathcoreairsoft#p/search/11/jCZpcvRZvQk

  15. http://www.youtube.com/user/spartan765#p/search/2/82YT5kHmwo8

  16. http://www.youtube.com/user/VulcanMarine#p/search/0/WIrXu4q4hCU

  17. http://www.youtube.com/user/DuskMarksmen#p/u/22/WEQXBAcMT8U

  18. http://www.youtube.com/watch?v=MBN22B3O_J8

  19. http://www.kewego.com/video/iLyROoafYsJp.html

  20. http://www.youtube.com/watch?v=WP8HEqXbQdo

  21. http://video.google.com/videoplay?docid=6545321893646982640#

  22. http://video.google.com/videoplay?docid=6545321893646982640#docid365676523829472362

  23. http://www.youtube.com/user/TheAirsoftReviewer1#p/search/0/aWxNfIZQj84

  24. http://www.youtube.com/user/ResidentEman#p/u/6/w8xos9pCcik

  25. http://www.youtube.com/user/SteveBanke#p/a/u/0/Gr1w8XTL8sO

  26. http://www.youtube.com/user/vanguardwolf1#p/u/6/AVRzzRezD_E

Cite this article

Banerjee, D., Islam, K., Xue, K. et al. A deep transfer learning approach for improved post-traumatic stress disorder diagnosis. Knowl Inf Syst 60, 1693–1724 (2019). https://doi.org/10.1007/s10115-019-01337-2

  • DOI: https://doi.org/10.1007/s10115-019-01337-2
