
An analysis of content-based classification of audio signals using a fuzzy c-means algorithm

Multimedia Tools and Applications

Abstract

Classifying audio signals by content into broad categories such as speech, music, or speech with noise is the first step before further processing such as speech recognition, content-based indexing, or surveillance. In this paper, we propose an efficient content-based approach that classifies audio signals into broad genres using a fuzzy c-means (FCM) algorithm. We analyze characteristic features of audio signals in the time, frequency, and coefficient domains and select the optimal feature vector by applying a novel analytical scoring method to each feature. We then apply an FCM-based classification scheme to the extracted, normalized optimal feature vector. Experimental results demonstrate that the proposed approach outperforms existing state-of-the-art audio classification systems by more than 11% in classification performance.
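As a rough illustration of the final step described above, the following is a minimal sketch of the standard (textbook) fuzzy c-means iteration applied to a normalized feature matrix. It is not the authors' implementation: the function names `min_max_normalize` and `fuzzy_c_means`, the min-max normalization, the fuzzifier `m = 2`, the Euclidean distance, and the stopping tolerance are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each feature column to [0, 1] (assumed normalization scheme)."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by 0
    return (X - lo) / span


def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard FCM. Returns (centers, U), where U[i, k] is the membership
    of sample i in cluster k and every row of U sums to 1."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalized per sample.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the samples.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center, shape (n, c).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against a sample sitting on a center
        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

A hard class label for each audio segment can then be read off as `U.argmax(axis=1)`, while the membership values themselves indicate how ambiguous a segment is between, for example, speech and music.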



Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2011–0017941).

Author information

Corresponding author

Correspondence to Jong-Myon Kim.


About this article

Cite this article

Haque, M.A., Kim, JM. An analysis of content-based classification of audio signals using a fuzzy c-means algorithm. Multimed Tools Appl 63, 77–92 (2013). https://doi.org/10.1007/s11042-012-1019-y


  • DOI: https://doi.org/10.1007/s11042-012-1019-y
