Modeling variable length phoneme sequences — A step towards linguistic information for speech emotion recognition in wider world

Abstract:

Vocal gestures play an important role in emotion expression and can be used by speech-based emotion recognition systems. This paper proposes the use of BLSTM neural networks to model salient variable-length phoneme sequences, which in turn can represent relevant vocal gestures. Unlike existing techniques, the proposed approach is not restricted to modelling phoneme sequences of a fixed length, and both the salience and the optimal modelling length of phoneme sequences are learnt from the training data. Three possible phoneme representations that can be modelled by BLSTMs are compared, and experimental results suggest that sequences of Phone Log Likelihood Ratios are more representative of emotions than sequences of phoneme labels represented as one-hot vectors. On the IEMOCAP database, the proposed approach achieves an Unweighted Average Recall (UAR) of 56.4% on a 4-class classification problem, an improvement of 6.5% in absolute terms over the previous approach of modelling fixed-length phoneme sequences. The proposed linguistic system is complementary to acoustic features, with a fused system leading to an absolute improvement of 5% in UAR.
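For readers unfamiliar with the metric, UAR is the mean of the per-class recalls, so every emotion class counts equally however many utterances it contains. The following is a minimal sketch of that definition only. The function name, the class names and the toy labels are illustrative assumptions and are not taken from the paper or from IEMOCAP.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of reference examples per class
    correct = defaultdict(int)  # number of correctly predicted examples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical 4-class toy labels, invented purely for illustration.
y_true = ["angry", "angry", "happy", "neutral", "neutral", "neutral", "sad", "sad"]
y_pred = ["angry", "happy", "happy", "neutral", "neutral", "sad", "sad", "angry"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On imbalanced data the two numbers differ because accuracy is dominated by the larger classes, which is one common reason for reporting UAR in emotion recognition.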

Published in: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII)

Date of Conference: 23-26 October 2017

Date Added to IEEE Xplore: 01 February 2018

ISBN Information:

Electronic ISSN: 2156-8111

DOI: 10.1109/ACII.2017.8273648

Conference Location: San Antonio, TX, USA
