Abstract
The recognition of emotion in human speech has gained increasing attention in recent years due to the wide variety of applications that benefit from such technology. Detecting emotion from speech can be viewed as a classification task: assigning an emotion category from a fixed set (e.g., happiness, anger) to a speech utterance. In this paper, we address two emotions, namely happiness and anger. The parameters extracted from a speech signal depend on the speaker, the spoken word, and the emotion. To isolate the effect of emotion, we keep the spoken utterance and the speaker constant and vary only the emotion. Different features are extracted to identify the parameters responsible for emotion, and wavelet packet transform (WPT) features are found to be emotion specific. We perform experiments using three methods. The first method uses the WPT and compares the number of coefficients greater than a threshold in different bands. The second method, also based on the WPT, compares the energy ratios of different bands. The third is a conventional method using MFCC features. The recognition rates obtained using WPT for the angry, happy, and neutral modes are 85 %, 65 %, and 80 % respectively, compared with 75 %, 45 %, and 60 % respectively using MFCC. Based on the WPT features, a model is proposed for emotion conversion, namely neutral-to-angry and neutral-to-happy conversion.
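The two WPT-based methods described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes a Haar wavelet filter and a fixed decomposition depth, whereas the paper does not specify the wavelet family or depth used. It builds the full wavelet packet tree, then computes the two per-band statistics the abstract names: the count of coefficients above a threshold, and the energy ratio of each band.

```python
import math

def haar_step(x):
    """One level of Haar analysis: orthonormal approximation and detail halves."""
    a = [(x[i] + x[i + 1]) / math.sqrt(2) for i in range(0, len(x) - 1, 2)]
    d = [(x[i] - x[i + 1]) / math.sqrt(2) for i in range(0, len(x) - 1, 2)]
    return a, d

def wavelet_packet(x, depth):
    """Full wavelet packet tree: unlike the plain DWT, every node is split,
    giving 2**depth terminal frequency bands."""
    nodes = [list(x)]
    for _ in range(depth):
        nxt = []
        for n in nodes:
            a, d = haar_step(n)
            nxt.append(a)
            nxt.append(d)
        nodes = nxt
    return nodes

def count_above_threshold(x, depth, thresh):
    """Method 1: number of coefficients exceeding a threshold in each band."""
    return [sum(1 for c in band if abs(c) > thresh)
            for band in wavelet_packet(x, depth)]

def band_energy_ratios(x, depth):
    """Method 2: fraction of total signal energy falling in each band."""
    bands = wavelet_packet(x, depth)
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies) or 1.0
    return [e / total for e in energies]
```

A classifier along the lines of the abstract would compare these per-band statistics for an utterance against reference profiles for the angry, happy, and neutral modes; since the Haar transform is orthonormal, the energy ratios of all bands sum to one.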
Cite this article
Degaonkar, V.N., Apte, S.D. Emotion modeling from speech signal based on wavelet packet transform. Int J Speech Technol 16, 1–5 (2013). https://doi.org/10.1007/s10772-012-9142-8