ABSTRACT
This paper addresses the AVEC 2017 Depression Sub-Challenge, where the objective is to propose methods which can aid automated prediction of depression severity. In this paper, we specifically focus on biomarkers of psychomotor retardation, which are a key trait of depressive episodes, to propose three sets of methods.
We propose a novel set of temporal features (which we call "turbulence features") and show their effectiveness. We offer a novel methodology to target specific craniofacial movements indicative of psychomotor retardation and hence of depression. Further, we present a novel method for quantifying abnormalities in the speech spectra of individuals with depression using Fisher vector encoding of spectral low-level descriptors (LLDs).
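To make the Fisher vector encoding step concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a diagonal-covariance Gaussian mixture model (GMM) has already been fitted to frame-level LLDs, and it applies the power and L2 normalization of the improved Fisher kernel; the abstract does not state the GMM size, the LLD set, or the normalization actually used. The function name `fisher_vector`, its parameter names, and the random toy data are all hypothetical.

```python
import numpy as np


def fisher_vector(X, weights, means, variances):
    """Encode frames X (T x D) as a Fisher vector under a diagonal GMM.

    weights: (K,), means: (K, D), variances: (K, D). All GMM parameters
    are assumed to be fitted beforehand; this sketch does not fit them.
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    T = X.shape[0]

    # Deviations of every frame from every component mean, scaled by sigma.
    diff = (X[:, None, :] - means[None, :, :]) / np.sqrt(variances)[None, :, :]

    # Log of the weighted Gaussian density per frame and component (T x K).
    log_prob = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        - 0.5 * np.sum(diff ** 2, axis=2)
    )

    # Posterior responsibilities via a numerically stable softmax.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Gradients with respect to the means and the standard deviations.
    g_mu = np.einsum("tk,tkd->kd", gamma, diff)
    g_mu /= T * np.sqrt(weights)[:, None]
    g_sigma = np.einsum("tk,tkd->kd", gamma, diff ** 2 - 1.0)
    g_sigma /= T * np.sqrt(2.0 * weights)[:, None]

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])

    # Power normalization followed by L2 normalization.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv


# Toy usage with random numbers standing in for spectral LLD frames.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 4))          # 50 frames, 4 descriptors
toy_weights = np.array([0.2, 0.3, 0.5])    # K = 3 components
toy_means = rng.normal(size=(3, 4))
toy_variances = np.ones((3, 4))
encoding = fisher_vector(frames, toy_weights, toy_means, toy_variances)
print(encoding.shape)  # (24,) i.e. 2 * K * D
```

The encoding has a fixed length of 2 x K x D regardless of how many frames a recording contains, which is what allows variable-length recordings to be passed to a standard regressor.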
So far, in predicting Patient Health Questionnaire (PHQ) scores on the AVEC test set, we achieve a root mean square error (RMSE) of 6.34 and a mean absolute error (MAE) of 5.30, both of which are better than the best test-set results reported in the baseline paper (6.97 and 5.66, respectively). This suggests that our method is a viable proof of concept and may lead to fully automated, objective depression screening protocols.
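For reference, the two evaluation metrics reported above can be computed as in the short sketch below. The score lists are hypothetical values chosen for easy arithmetic; they are not challenge data, and the helper names `rmse` and `mae` are illustrative only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length score lists."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length score lists."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical PHQ scores (not taken from the challenge data).
true_scores = [0, 5, 10, 20]
predicted_scores = [2, 5, 7, 14]

print(rmse(true_scores, predicted_scores))  # 3.5
print(mae(true_scores, predicted_scores))   # 2.75
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE does, which is why the two figures are reported together.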