Research on sound classification based on SVM

Wei, Pengcheng; He, Fangcheng; Li, Li; Li, Jing

doi:10.1007/s00521-019-04182-0

Deep Learning for Big Data Analytics
Published: 16 April 2019

Volume 32, pages 1593–1607, (2020)

Neural Computing and Applications

Pengcheng Wei¹, Fangcheng He², Li Li¹ & Jing Li¹

920 Accesses
17 Citations

Abstract

Sound is a ubiquitous natural phenomenon that carries a wealth of information about the objective world. With the continuous development of computer networking and communication technology, audio has become an important class of information. Audio, however, is a non-semantic symbolic representation and an unstructured binary stream: it lacks both a description of its content semantics and a structured organization, which makes audio classification difficult. As the number of digital audio resources on the network grows, research on digital audio classification becomes increasingly important. Digital audio classification is the key technology for structuring audio and for extracting structured information and content semantics from it. It is a research hot spot in the field of audio analysis and has important application value in fields such as audio retrieval, video summarization and auxiliary video analysis. This paper studies the structure of audio, the analysis and extraction of audio features, a digital audio classifier based on support vector machines (SVM) and audio segmentation based on the Bayesian information criterion (BIC). SVM is an important achievement of machine learning research in recent years. It can handle practical problems involving small samples, nonlinearity and high dimensionality, and has therefore become a research hot spot following the study of neural networks. Experiments show that the SVM-based audio classification algorithm classifies well and that the smoothed audio segmentation results are more accurate. With further development, the results are expected to be applicable in practice.
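The abstract does not state which features, kernel or training procedure the paper uses, so the following is only a minimal sketch of the general idea of SVM-based audio classification, under assumptions that are not taken from the paper: two classic short-time features (energy and zero-crossing rate), a linear kernel trained by stochastic sub-gradient descent on the hinge loss, and synthetic "tonal" and "noisy" frames in place of real audio. All function names and hyperparameters (`frame_features`, `train_linear_svm`, `lam`, `eta`, `epochs`) are hypothetical.

```python
import math
import random


def frame_features(samples):
    """Return [short-time energy, zero-crossing rate] for one audio frame."""
    n = len(samples)
    energy = sum(s * s for s in samples) / n
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return [energy, crossings / (n - 1)]


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200, seed=0):
    """Fit w, b by stochastic sub-gradient descent on the regularised hinge loss.

    Labels must be +1 or -1. lam, eta and epochs are illustrative defaults,
    not values from the paper.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Shrink w on every step (gradient of the L2 regulariser).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if y[i] * score < 1.0:
                # Margin violated: step along the hinge-loss sub-gradient.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 from the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def tone_frame(rng, n=200):
    """Synthetic 'tonal' frame (assumption): a sine with 2 to 10 cycles."""
    cycles = rng.uniform(2.0, 10.0)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [math.sin(2.0 * math.pi * cycles * k / n + phase) for k in range(n)]


def noise_frame(rng, n=200):
    """Synthetic 'noisy' frame (assumption): uniform white noise in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


rng = random.Random(42)
X = [frame_features(tone_frame(rng)) for _ in range(20)]
X += [frame_features(noise_frame(rng)) for _ in range(20)]
y = [-1] * 20 + [1] * 20  # -1 = tonal, +1 = noisy

w, b = train_linear_svm(X, y)
test_tone = frame_features(tone_frame(rng))
test_noise = frame_features(noise_frame(rng))
print(predict(w, b, test_tone), predict(w, b, test_noise))
```

The two synthetic classes are well separated in zero-crossing rate, so a linear decision boundary suffices here. Real audio classes generally need richer features and possibly a nonlinear kernel, which this sketch does not attempt.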



Acknowledgements

This work was supported by the Chongqing Big Data Engineering Laboratory for Children, the Chongqing Electronics Engineering Technology Research Center for Interactive Learning, the Science and Technology Research Project of Chongqing Municipal Education Commission of China (No. KJ1601401), the Science and Technology Research Project of Chongqing University of Education (No. KY201725C), Basic Research and Frontier Exploration of Chongqing Science and Technology Commission (No. CSTC2014jcyjA40019), and the Project of Science and Technology Research Program of Chongqing Education Commission of China (No. KJZD-K201801601).

Author information

Authors and Affiliations

School of Mathematics and Information Engineering, Chongqing University of Education at Nanshan, No. 1 Chongjiao Rd. Nanshan Avenue, Nanan District, Chongqing, China
Pengcheng Wei, Li Li & Jing Li
College of Foreign Languages Literature, Chongqing University of Education at Nanshan, No. 1 Chongjiao Rd. Nanshan Avenue, Nanan District, Chongqing, China
Fangcheng He

Corresponding author

Correspondence to Pengcheng Wei.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wei, P., He, F., Li, L. et al. Research on sound classification based on SVM. Neural Comput & Applic 32, 1593–1607 (2020). https://doi.org/10.1007/s00521-019-04182-0

Received: 20 January 2019
Accepted: 29 March 2019
Published: 16 April 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s00521-019-04182-0
