IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Regular Section
Speech Emotion Detection Using Fusion on Multi-Source Low-Level Information Based Recurrent Branches
Jiaxin WU, Bing LI, Li ZHAO, Xinzhou XU
JOURNAL FREE ACCESS

2024 Volume E107.A Issue 11 Pages 1641-1649

Abstract

Speech Emotion Detection (SED) aims to distinguish the positive class from the negative class when a speaker expresses emotions. SED performance depends heavily on the diversity and prominence of the emotional features extracted from speech. However, most existing research focuses on a single feature source and on hand-crafted features. We therefore propose an SED approach using multi-source low-level information based recurrent branches (MSIR). Fusing multi-source low-level information yields diverse and discriminative representations of emotional speech signals. In addition, the focal loss function helps with imbalanced classes: it reduces the loss contribution of well-classified samples and increases the weight of difficult samples in SED tasks. Experiments on the IEMOCAP corpus demonstrate the effectiveness of the proposed method. Compared with the baselines, MSIR achieves significant performance improvements in terms of Unweighted Average Recall and F1-score.
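The reweighting effect of the focal loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the standard binary focal loss form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), and the function names and the values gamma = 2.0 and alpha = 0.25 are assumptions chosen for illustration, since the abstract does not state the settings used.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Standard binary focal loss for a single sample.

    p     : predicted probability of the positive class
    y     : ground-truth label, 1 (positive) or 0 (negative)
    gamma : focusing parameter (assumed value, not from the paper)
    alpha : class-balancing weight for the positive class (assumed value)
    """
    # Probability assigned to the true class, and its class weight.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of confidently correct samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for a single sample, for comparison."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, eps))


# A well-classified positive sample versus a difficult one.
easy_ratio = binary_focal_loss(0.9, 1) / cross_entropy(0.9, 1)
hard_ratio = binary_focal_loss(0.1, 1) / cross_entropy(0.1, 1)
print(f"focal/CE ratio, easy sample: {easy_ratio:.4f}")
print(f"focal/CE ratio, hard sample: {hard_ratio:.4f}")
```

Relative to plain cross-entropy, the well-classified sample keeps a much smaller fraction of its loss than the difficult one, which is the behaviour the abstract describes. Setting gamma to 0 removes the focusing term and leaves an alpha-weighted cross-entropy.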

© 2024 The Institute of Electronics, Information and Communication Engineers