Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Deep Belief Networks

Ji-Hyun SONG; Hong-Sub AN; Sangmin LEE

doi:10.1587/transfun.E97.A.661

Abstract

In this paper, we propose a robust speech/music classification algorithm that improves the performance of speech/music classification in the 3GPP2 selectable mode vocoder (SMV) by using deep belief networks (DBNs). A DBN is a powerful hierarchical generative model for feature extraction that can determine the underlying discriminative characteristics of the extracted features. In the proposed DBN-based method, the six feature vectors selected from the relevant parameters of the SMV are applied to the visible layer. The performance of the proposed algorithm is evaluated in terms of the detection accuracy and error probability of speech and music for various music genres. The proposed algorithm yields better results than both the original SMV method and a support vector machine (SVM) based method.
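
The evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so the following are assumptions: detection accuracy is taken as the fraction of frames of a class that are labeled correctly, and error probability as the fraction of all frames that are misclassified. The function name `classification_metrics` and the frame labels are hypothetical and are not taken from the paper.

```python
def classification_metrics(true_labels, predicted_labels):
    """Per-class detection accuracy and overall error probability.

    Assumed definitions (not taken from the paper):
      detection accuracy of a class = correctly labeled frames of that class
                                      / all frames of that class
      error probability             = misclassified frames / all frames
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one frame is required")

    totals = {"speech": 0, "music": 0}
    correct = {"speech": 0, "music": 0}
    for truth, predicted in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == predicted:
            correct[truth] += 1

    errors = len(true_labels) - sum(correct.values())

    def ratio(cls):
        # None signals that the class never occurs in the reference labels.
        return correct[cls] / totals[cls] if totals[cls] else None

    return {
        "speech_accuracy": ratio("speech"),
        "music_accuracy": ratio("music"),
        "error_probability": errors / len(true_labels),
    }


# Hypothetical frame-level decisions, for illustration only.
truth = ["speech", "speech", "music", "music", "music"]
decision = ["speech", "music", "music", "music", "speech"]
metrics = classification_metrics(truth, decision)
```

Reporting accuracy per class, rather than a single overall figure, keeps a classifier that favors one class from looking better than it is when the test material is unbalanced across speech and music.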
