Abstract
Musical genres are categorical labels created by humans to characterize pieces of music. These labels may be highly subjective, but they are typically related to the instrumentation, rhythmic structure, and harmonic content of the music. In this paper, we propose a model for musical genre classification, the bag-of-tones (BOT) model, which follows the same idea as the bag-of-words (BOW) model in natural language processing and the bag-of-features (BOF) model in image processing. Basic low-level music features such as Mel-frequency cepstral coefficients (MFCC) are clustered into a set of codewords referred to as “tones”. With this model, each piece of music can be represented by a new feature vector: its distribution over tones. Classical machine learning models such as support vector machines (SVM) can then be applied for genre classification. The model is tested on two datasets. We found that the polynomial kernel function gives the best performance in SVM classification. Compared with previous work, the proposed model outperforms classical models on a given benchmark dataset. In general, this model can be used to structure the large collections of music available on the Web, and it can play an important role in automatic digital music categorization and retrieval.
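The bag-of-tones representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`learn_tones`, `tone_histogram`) are hypothetical, plain k-means is an assumed choice of clustering method (the abstract does not name one), and the toy 2-D frames merely stand in for real MFCC vectors. The SVM stage is omitted; the resulting histograms are the feature vectors that would be passed to a classifier.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(frame, codebook):
    """Index of the codeword ("tone") closest to one feature frame."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(frame, codebook[k]))

def learn_tones(frames, n_tones, n_iter=20, seed=0):
    """Cluster frame-level features with plain k-means (an assumption).

    Returns a codebook of n_tones centroids.
    """
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, n_tones)]
    for _ in range(n_iter):
        groups = [[] for _ in range(n_tones)]
        for f in frames:
            groups[nearest(f, codebook)].append(f)
        for k, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                codebook[k] = [sum(col) / len(members)
                               for col in zip(*members)]
    return codebook

def tone_histogram(frames, codebook):
    """Represent one piece of music as its normalized distribution over tones."""
    counts = [0] * len(codebook)
    for f in frames:
        counts[nearest(f, codebook)] += 1
    return [c / len(frames) for c in counts]

# Toy data: each "track" is a list of 2-D frames standing in for MFCC vectors.
track_a = [[0.1 * i, 0.0] for i in range(10)]
track_b = [[10.0 + 0.1 * i, 10.0] for i in range(10)]

# The codebook is learned from the frames of all tracks pooled together.
codebook = learn_tones(track_a + track_b, n_tones=2)

# Each track becomes a fixed-length vector regardless of its duration.
hist_a = tone_histogram(track_a, codebook)
hist_b = tone_histogram(track_b, codebook)
```

Because every track maps to a vector whose length equals the number of tones, tracks of different durations become directly comparable, which is what allows a standard classifier such as an SVM to be applied afterwards.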
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qin, Z., Liu, W., Wan, T. (2013). A Bag-of-Tones Model with MFCC Features for Musical Genre Classification. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53914-5_48
DOI: https://doi.org/10.1007/978-3-642-53914-5_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53913-8
Online ISBN: 978-3-642-53914-5
eBook Packages: Computer Science, Computer Science (R0)