Content-based audio classification and segmentation by using support vector machines

Lu, Lie; Zhang, Hong-Jiang; Li, Stan Z.

doi:10.1007/s00530-002-0065-0

Regular paper
Published: April 2003

Volume 8, pages 482–492, (2003)

Multimedia Systems

Lie Lu¹,
Hong-Jiang Zhang¹ &
Stan Z. Li¹

Abstract.

Content-based audio classification and segmentation is a basis for further audio/video analysis. In this paper, we present our work on audio segmentation and classification using support vector machines (SVMs). Five audio classes are considered: silence, music, background sound, pure speech, and non-pure speech, which includes speech over music and speech over noise. A sound stream is segmented by classifying each sub-segment into one of these five classes. We evaluate the performance of SVMs on classifying different pairs of audio types with testing units of different lengths, and compare the performance of SVM, K-Nearest Neighbor (KNN), and Gaussian Mixture Model (GMM) classifiers. We also evaluate the effectiveness of several newly proposed features. Experiments on a database of about 4 hours of audio data show that the proposed classifier is very efficient for audio classification and segmentation, and that the accuracy of the SVM-based method is much better than that of the methods based on KNN and GMM.
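The segmentation-by-classification idea stated in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the 1-second sub-segment length, and the energy-threshold toy classifier are all assumptions made for illustration. In the paper's setting the classifier would be a trained SVM operating on audio features. The sketch only shows how per-sub-segment labels turn into segments by merging adjacent sub-segments that share a label.

```python
from itertools import groupby
from typing import Callable, List, Sequence, Tuple

# The five classes named in the abstract.
CLASSES = ("silence", "music", "background sound", "pure speech", "non-pure speech")


def segment_by_classification(
    windows: Sequence[Sequence[float]],
    classify: Callable[[Sequence[float]], str],
    window_sec: float = 1.0,  # assumed sub-segment length, not taken from the source
) -> List[Tuple[float, float, str]]:
    """Label each sub-segment, then merge runs of equal labels into segments.

    Returns a list of (start_sec, end_sec, label) tuples.
    """
    labels = [classify(w) for w in windows]
    segments = []
    start = 0
    for label, run in groupby(labels):
        n = len(list(run))
        segments.append((start * window_sec, (start + n) * window_sec, label))
        start += n
    return segments


def toy_classifier(window: Sequence[float]) -> str:
    """Hypothetical stand-in for a trained SVM: a simple energy threshold."""
    energy = sum(x * x for x in window)
    return "silence" if energy < 1e-3 else "music"


if __name__ == "__main__":
    stream = [[0.0] * 4, [0.0] * 4, [0.5] * 4, [0.5] * 4, [0.0] * 4]
    for seg in segment_by_classification(stream, toy_classifier):
        print(seg)
```

Segment boundaries here fall wherever the predicted class changes between consecutive sub-segments, so the boundary resolution is limited by the sub-segment length.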

Author information

Authors and Affiliations

Microsoft Research Asia, 5F Beijing Sigma Center, No. 49 Zhichun Road, Hai Dian District, Beijing 100080, China (e-mail: {llu,hjzhang,szli}@microsoft.com)
Lie Lu, Hong-Jiang Zhang & Stan Z. Li

About this article

Cite this article

Lu, L., Zhang, HJ. & Li, S. Content-based audio classification and segmentation by using support vector machines. Multimedia Systems 8, 482–492 (2003). https://doi.org/10.1007/s00530-002-0065-0

Issue Date: April 2003
DOI: https://doi.org/10.1007/s00530-002-0065-0

Key words: Audio content analysis, audio classification and segmentation, support vector machines
