Abstract:
Cross-corpus speech emotion recognition, in which a model is trained on one corpus and tested on another, has attracted much attention because of the wide variety of emotional speech encountered in everyday life. It generally involves two basic problems: learning a corpus-invariant feature representation and modeling the relevance across different corpora. To deal with these two problems, we propose a novel transfer learning method called transfer sparse discriminant subspace learning (TSDSL) in this article. Specifically, to solve the first problem, we learn a common feature subspace of the different corpora by introducing discriminative learning and an ℓ2,1-norm penalty, which together select the most discriminative features across corpora. To address the second problem, we construct a novel nearest neighbor graph as the distance metric, in which the similarity between different corpora is measured simultaneously. Extensive experiments on cross-corpus speech emotion recognition tasks show that our method achieves competitive performance compared with state-of-the-art algorithms.
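As a rough illustration of the ℓ2,1-norm penalty mentioned above, the sketch below computes the standard ℓ2,1 norm of a matrix: the sum of the Euclidean norms of its rows. It is not the TSDSL implementation. The function name `l21_norm` and the example matrix `W` are hypothetical, chosen only to show why penalizing this quantity favors row-sparse projection matrices, that is, matrices in which whole feature rows are driven to zero.

```python
import math


def l21_norm(W):
    """Return the l2,1 norm of W: the sum of the Euclidean norms of its rows.

    W is a list of rows, each row a list of numbers (hypothetical layout:
    one row per input feature, one column per subspace dimension).
    """
    return sum(math.sqrt(sum(x * x for x in row)) for row in W)


# Hypothetical projection matrix with 3 features and 2 subspace dimensions.
# The all-zero second row corresponds to a feature that is discarded entirely.
W = [
    [3.0, 4.0],  # row norm 5
    [0.0, 0.0],  # row norm 0
    [0.0, 1.0],  # row norm 1
]

value = l21_norm(W)  # 5 + 0 + 1
```

Because each row contributes its whole Euclidean norm, the penalty shrinks most cheaply by zeroing an entire row, which is the usual reason this norm is used for feature selection.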
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing (Volume: 28)