
Fusion of classifier predictions for audio-visual emotion recognition


Abstract:

This paper presents a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel, and facial-landmark geometric relations are computed from the visual data. Both feature sets are learnt separately using state-of-the-art classifiers. In addition, each emotion video is summarised into a reduced set of key-frames, which a Convolutional Neural Network learns in order to visually discriminate emotions. Finally, the confidence outputs of all classifiers from all modalities define a new feature space, which is itself learnt for final emotion prediction in a late-fusion/stacking fashion. Experiments conducted on the eNTERFACE'05 database show significant performance improvements of the proposed system over state-of-the-art approaches.
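The late-fusion/stacking scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature matrices are random placeholders standing in for the paper's MFCC, landmark-geometry, and CNN key-frame representations, and the classifier choices (SVMs plus a logistic-regression meta-learner) are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder features standing in for the per-modality representations
# (real inputs would be MFCC statistics and facial-landmark geometry).
n_train, n_classes = 200, 6  # eNTERFACE'05 covers six basic emotions
X_audio = rng.normal(size=(n_train, 13))
X_visual = rng.normal(size=(n_train, 20))
y = rng.permutation(np.arange(n_train) % n_classes)  # balanced labels

# Stage 1: learn each modality separately with its own classifier.
clf_audio = SVC(probability=True).fit(X_audio, y)
clf_visual = SVC(probability=True).fit(X_visual, y)

# Stage 2 (stacking): the classifiers' confidence outputs are
# concatenated into a new feature space for a meta-classifier.
Z = np.hstack([clf_audio.predict_proba(X_audio),
               clf_visual.predict_proba(X_visual)])
meta = LogisticRegression(max_iter=1000).fit(Z, y)

# Final prediction fuses both modalities through the meta-classifier.
Z_new = np.hstack([clf_audio.predict_proba(X_audio[:5]),
                   clf_visual.predict_proba(X_visual[:5])])
pred = meta.predict(Z_new)
```

In practice the stacked features would be produced from out-of-fold predictions rather than from the same training data, to keep the meta-classifier from overfitting to overconfident base-classifier outputs.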
Date of Conference: 04-08 December 2016
Date Added to IEEE Xplore: 24 April 2017
Conference Location: Cancun, Mexico
