Duration Modeling for Emotional Speech

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7473)

Abstract

Human interaction involves exchanging not only explicit content but also implicit information about the affective state of the interlocutor. In recent years, researchers have attempted to endow computers and robots with human-like qualities. Various affective computing models have been proposed, covering emotion recognition, interpretation, management, and generation. Analyzing and predicting the prosodic information of different emotions is therefore very important for future applications. In this article, a duration modeling approach for emotional speech is presented. Seven kinds of emotion are adopted: natural, scare, angry, elation, sadness, surprise, and disgust. Based on statistics computed on a corpus covering these seven emotions, a question set that considers acoustic and linguistic factors is designed. Experimental results show that the root mean squared errors (RMSEs) of syllable duration are 0.0725 s and 0.0802 s for the training and testing sets, respectively. From these results, the impact of factors related to different emotions can be explored.
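For readers unfamiliar with the error measure reported in the abstract, the following minimal sketch shows how an RMSE over syllable durations could be computed. The function name `rmse` and the toy duration values are assumptions made for illustration; they are not taken from the paper or its corpus, and the authors' actual evaluation code is not shown in this preview.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences of
    durations, both given in seconds."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy syllable durations in seconds (not from the paper).
observed = [0.20, 0.25, 0.30, 0.15]
predicted = [0.25, 0.25, 0.20, 0.15]
error = rmse(observed, predicted)
print(f"RMSE: {error:.4f} s")
```

Because the RMSE is expressed in the same unit as the durations, values such as those in the abstract can be read directly as a typical prediction error in seconds per syllable.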

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lai, WH., Wang, SL. (2012). Duration Modeling for Emotional Speech. In: Liu, B., Ma, M., Chang, J. (eds) Information Computing and Applications. ICICA 2012. Lecture Notes in Computer Science, vol 7473. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34062-8_13

  • DOI: https://doi.org/10.1007/978-3-642-34062-8_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34061-1

  • Online ISBN: 978-3-642-34062-8

  • eBook Packages: Computer Science, Computer Science (R0)
