Abstract
Speech signals are non-stationary processes whose content changes over time and frequency. The structure of a speech signal is also affected by several paralinguistic phenomena such as emotions, pathologies, and cognitive impairments, among others. Non-stationarity can be modeled using several parametric techniques. A novel approach based on time-dependent auto-regressive moving average (TARMA) models is proposed here to model the non-stationarity of speech signals. The model is tested on the recognition of "fear-type" emotions in speech. The proposed approach is applied to model syllables and unvoiced segments extracted from recordings of the Berlin and eNTERFACE'05 databases. The results indicate that TARMA models can be used for the automatic recognition of emotions in speech.
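As a rough illustration of the TARMA idea, the sketch below fits a functional-series time-varying AR model, where the AR coefficients are expanded over a polynomial basis in time so the model becomes linear in the expansion coefficients and solvable by ordinary least squares. This covers only the AR part (the MA part requires iterative estimation); the function name `fit_tvar`, the basis choice, and all parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_tvar(x, p=2, q_basis=3):
    """Fit a functional-series time-varying AR model of order p.

    Each AR coefficient a_i(t) is expanded over a polynomial basis
    G_j(t) = t**j, with t normalized to [0, 1]. The model
    x[k] = sum_i a_i(t_k) * x[k-i] + e[k] is then linear in the
    expansion coefficients, so they are found by least squares.
    """
    n = len(x)
    t = np.linspace(0.0, 1.0, n)
    # Regression matrix: one column per lagged sample times basis function.
    rows = []
    for k in range(p, n):
        rows.append([x[k - i] * t[k] ** j
                     for i in range(1, p + 1)
                     for j in range(q_basis)])
    Phi = np.asarray(rows)
    y = x[p:]
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    resid = y - Phi @ coef
    return coef.reshape(p, q_basis), resid

# Synthetic non-stationary signal: AR(1) whose coefficient drifts in time.
rng = np.random.default_rng(0)
n = 2000
a_true = 0.5 + 0.4 * np.linspace(0.0, 1.0, n)  # a(t) drifts from 0.5 to 0.9
x = np.zeros(n)
for k in range(1, n):
    x[k] = a_true[k] * x[k - 1] + 0.1 * rng.standard_normal()

coef, resid = fit_tvar(x, p=1, q_basis=2)
# Recovered trajectory: a(t) is approximated by coef[0, 0] + coef[0, 1] * t.
```

Because the coefficients are functions of time, a single short model can track the drifting dynamics of a syllable instead of assuming local stationarity as frame-based analysis does.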
© 2015 Springer International Publishing Switzerland
Cite this paper
Vásquez-Correa, J.C., Orozco-Arroyave, J.R., Arias-Londoño, J.D., Vargas-Bonilla, J.F., Avendaño, L.D., Nöth, E. (2015). Time Dependent ARMA for Automatic Recognition of Fear-Type Emotions in Speech. In: Král, P., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2015. Lecture Notes in Computer Science(), vol 9302. Springer, Cham. https://doi.org/10.1007/978-3-319-24033-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24032-9
Online ISBN: 978-3-319-24033-6
eBook Packages: Computer Science; Computer Science (R0)