Abstract
Content-based classification of audio signals into broad categories such as speech, music, or speech with noise is the first step before further processing in applications such as speech recognition, content-based indexing, or surveillance systems. In this paper, we propose an efficient content-based audio classification approach that classifies audio signals into broad genres using a fuzzy c-means (FCM) algorithm. We analyze different characteristic features of audio signals in the time, frequency, and coefficient domains and select the optimal feature vector by applying a novel analytical scoring method to each feature. We then apply an FCM-based classification scheme to the extracted, normalized optimal feature vector to achieve an efficient classification result. Experimental results demonstrate that the proposed approach outperforms existing state-of-the-art audio classification systems by more than 11% in classification performance.
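As a rough illustration of the clustering step the abstract refers to, the following is a minimal sketch of the textbook FCM update rules applied to normalized feature vectors. It is not the authors' implementation: the min-max normalization, the fuzzifier value m = 2, the toy two-dimensional feature values, and the names `min_max_normalize` and `fuzzy_c_means` are all assumptions made for illustration.

```python
import math
import random


def min_max_normalize(vectors):
    """Scale each feature column to [0, 1] (an assumed normalization choice)."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in vectors
    ]


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook FCM. Returns (centers, memberships), where memberships[k][i]
    is the degree to which point k belongs to cluster i, as computed in the
    last membership update before the final center update."""
    rng = random.Random(seed)
    # Use randomly chosen data points as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_ki = 1 / sum_j (d_ki / d_kj)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point sits on a center: split membership among those centers.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                memberships.append([h / sum(hits) for h in hits])
            else:
                memberships.append([
                    1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                    for d_i in dists
                ])
        # Center update: mean of all points weighted by membership^m.
        new_centers = []
        for i in range(n_clusters):
            weights = [u[i] ** m for u in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(centers[i])  # keep an empty cluster's center
                continue
            new_centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(len(points[0]))
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Toy, made-up feature vectors forming two well-separated groups.
features = [[0.10, 10.0], [0.20, 12.0], [0.15, 11.0],
            [0.90, 50.0], [0.85, 52.0], [0.95, 49.0]]
centers, memberships = fuzzy_c_means(min_max_normalize(features), n_clusters=2)
# Harden the fuzzy memberships into one label per vector.
labels = [max(range(2), key=lambda i: row[i]) for row in memberships]
```

Taking the cluster with the highest membership for each vector is one simple way to turn the fuzzy output into a class decision; mapping clusters to named classes such as speech or music would require labeled data, which this sketch does not cover.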
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2011-0017941).
Cite this article
Haque, M.A., Kim, JM. An analysis of content-based classification of audio signals using a fuzzy c-means algorithm. Multimed Tools Appl 63, 77–92 (2013). https://doi.org/10.1007/s11042-012-1019-y
DOI: https://doi.org/10.1007/s11042-012-1019-y