DOI: 10.1145/2988257.2988265
research-article

Staircase Regression in OA RVM, Data Selection and Gender Dependency in AVEC 2016

Published: 16 October 2016

Abstract

Within the field of affective computing, human emotion and disorder/disease recognition have progressively attracted more interest in multimodal analysis. This submission to the Depression Classification and Continuous Emotion Prediction challenges for AVEC 2016 investigates both, with a focus on audio subsystems. For depression classification, we investigate token word selection, vocal tract coordination parameters computed from spectral centroid features, and gender-dependent classification systems. Token word selection performed very well on the development set. For emotion prediction, we investigate emotionally salient data selection based on emotion change, an output-associative regression approach based on the probabilistic outputs of relevance vector machine classifiers operating on low-high class pairs (OA RVM-SR), and gender-dependent systems. Experimental results from both the development and test sets show that the RVM-SR method under the OA framework can improve on OA RVM, which performed very well in the AV+EC 2015 challenge.
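
As a rough illustration of the staircase idea described in the abstract, the sketch below trains one binary classifier per low-high split of the target range and turns the summed class probabilities into a continuous estimate. This is not the authors' implementation: plain logistic regression stands in for the relevance vector machine, the output-associative stage is omitted, and the evenly spaced thresholds, the summation rule, the synthetic data, and all function names are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, labels, lr=0.5, epochs=300):
    """Fit p(label = 1 | x) = sigmoid(w * x + b) by batch gradient descent.

    Stand-in for an RVM classifier (assumption: any probabilistic
    binary classifier could fill this role in the sketch).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, t in zip(xs, labels):
            err = sigmoid(w * x + b) - t
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def fit_staircase(xs, ys, n_steps=4):
    """Train one low-vs-high classifier per threshold on the target range."""
    lo, hi = min(ys), max(ys)
    step = (hi - lo) / n_steps
    # Hypothetical choice: thresholds at the midpoints of n_steps equal bins.
    thresholds = [lo + (k + 0.5) * step for k in range(n_steps)]
    classifiers = [
        train_logistic(xs, [1.0 if y > t else 0.0 for y in ys])
        for t in thresholds
    ]
    return lo, step, classifiers


def predict_staircase(model, x):
    """Each classifier's P(high | x) contributes a fraction of one stair step."""
    lo, step, classifiers = model
    return lo + step * sum(sigmoid(w * x + b) for w, b in classifiers)


# Synthetic one-dimensional data: the target rises with the feature.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [0.8 * x + rng.gauss(0.0, 0.05) for x in xs]

model = fit_staircase(xs, ys)
print(predict_staircase(model, -0.9), predict_staircase(model, 0.9))
```

With more thresholds the staircase becomes finer, at the cost of training more classifiers, some of them on increasingly unbalanced low-high splits.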

Published In

AVEC '16: Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge
October 2016
114 pages
ISBN: 9781450345163
DOI: 10.1145/2988257
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. annotation delay compensation
  2. depression classification
  3. dimensional emotion prediction
  4. gender dependence
  5. multimodal fusion
  6. output-associative fusion
  7. relevance vector machine
  8. token word selection

Qualifiers

  • Research-article

Conference

MM '16
Sponsor:
MM '16: ACM Multimedia Conference
October 16, 2016
Amsterdam, The Netherlands

Acceptance Rates

AVEC '16 Paper Acceptance Rate 12 of 14 submissions, 86%;
Overall Acceptance Rate 52 of 98 submissions, 53%

Cited By

  • (2024) Improving Performance of Speech Emotion Recognition Application using Extreme Learning Machine and Utterance-level. 2024 International Seminar on Intelligent Technology and Its Applications (ISITIA) (466-470). DOI: 10.1109/ISITIA63062.2024.10668153. Online publication date: 10-Jul-2024
  • (2023) Ordinal Logistic Regression With Partial Proportional Odds for Depression Prediction. IEEE Transactions on Affective Computing 14:1 (563-577). DOI: 10.1109/TAFFC.2020.3031300. Online publication date: 1-Jan-2023
  • (2022) Continuous Emotion Recognition for Long-Term Behavior Modeling through Recurrent Neural Networks. Technologies 10:3 (59). DOI: 10.3390/technologies10030059. Online publication date: 12-May-2022
  • (2022) Investigation of Speech Landmark Patterns for Depression Detection. IEEE Transactions on Affective Computing 13:2 (666-679). DOI: 10.1109/TAFFC.2019.2944380. Online publication date: 1-Apr-2022
  • (2021) Compensation Techniques for Speaker Variability in Continuous Emotion Prediction. IEEE Transactions on Affective Computing 12:2 (439-452). DOI: 10.1109/TAFFC.2018.2883044. Online publication date: 1-Apr-2021
  • (2021) Multimodal Emotion Recognition Based on Speech and Physiological Signals Using Deep Neural Networks. Pattern Recognition. ICPR International Workshops and Challenges (289-300). DOI: 10.1007/978-3-030-68780-9_25. Online publication date: 25-Feb-2021
  • (2020) An efficient model-level fusion approach for continuous affect recognition from audiovisual signals. Neurocomputing 376:C (42-53). DOI: 10.1016/j.neucom.2019.09.037. Online publication date: 1-Feb-2020
  • (2019) Automatic Assessment of Depression Based on Visual Cues: A Systematic Review. IEEE Transactions on Affective Computing 10:4 (445-470). DOI: 10.1109/TAFFC.2017.2724035. Online publication date: 1-Oct-2019
  • (2019) Dynamic Facial Models for Video-Based Dimensional Affect Estimation. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW) (1608-1617). DOI: 10.1109/ICCVW.2019.00200. Online publication date: Oct-2019
  • (2019) Speech Landmark Bigrams for Depression Detection from Naturalistic Smartphone Speech. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (5856-5860). DOI: 10.1109/ICASSP.2019.8682916. Online publication date: May-2019
