A Bag-of-Tones Model with MFCC Features for Musical Genre Classification

  • Conference paper
Advanced Data Mining and Applications (ADMA 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8346)

Abstract

Musical genres are categorical labels created by humans to characterize pieces of music. These labels can be highly subjective, but they typically relate to the instrumentation, rhythmic structure, and harmonic content of the music. In this paper, we propose a new model for music genre classification, the bag-of-tones (BOT) model, which follows the same idea as the bag-of-words (BOW) model in natural language processing and the bag-of-features (BOF) model in image processing. Basic low-level music features such as Mel-frequency cepstral coefficients (MFCC) are clustered into a set of codewords referred to as “tones”. Each piece of music can then be represented by a new feature vector: its distribution over tones. Classical machine learning models such as support vector machines (SVM) can then be applied for genre classification. The model is tested on two datasets. We found that the polynomial kernel function gives the best performance in SVM classification. Compared with previous work, the proposed model outperforms classical models on a given benchmark dataset. In general, this model can be used to structure the large collections of music available on the Web, and it can play an important role in automatic digital music categorization and retrieval.
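The codebook step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says that MFCC features are clustered, so plain k-means with a farthest-point initialisation is an assumption, as are all function names and the two-dimensional toy frames that stand in for real MFCC vectors. The final SVM stage is not shown.

```python
import random
from math import dist


def nearest(frame, centroids):
    """Index of the centroid closest to `frame` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(frame, centroids[i]))


def init_centroids(frames, k):
    """Farthest-point initialisation (an assumption, chosen because it is deterministic)."""
    centroids = [frames[0]]
    while len(centroids) < k:
        centroids.append(max(frames, key=lambda f: min(dist(f, c) for c in centroids)))
    return centroids


def learn_tones(frames, k, iters=20):
    """Cluster feature frames with plain k-means; the k centroids act as the 'tones'."""
    centroids = init_centroids(frames, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for frame in frames:
            clusters[nearest(frame, centroids)].append(frame)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids


def tone_histogram(frames, tones):
    """Represent one piece as its normalised distribution over tones."""
    counts = [0] * len(tones)
    for frame in frames:
        counts[nearest(frame, tones)] += 1
    return [c / len(frames) for c in counts]


def toy_frames(center, n, rng):
    """Hypothetical stand-in for MFCC frames: Gaussian noise around `center`."""
    return [tuple(rng.gauss(c, 0.1) for c in center) for _ in range(n)]


rng = random.Random(42)
# Two toy "pieces" that mix the same two kinds of frames in different proportions.
piece_a = toy_frames((0.0, 0.0), 50, rng) + toy_frames((5.0, 5.0), 10, rng)
piece_b = toy_frames((0.0, 0.0), 10, rng) + toy_frames((5.0, 5.0), 50, rng)

tones = learn_tones(piece_a + piece_b, k=2)
hist_a = tone_histogram(piece_a, tones)
hist_b = tone_histogram(piece_b, tones)
print(hist_a, hist_b)
```

In a full pipeline, each histogram would serve as the fixed-length feature vector passed to a classifier such as an SVM with a polynomial kernel, the configuration the abstract reports as performing best. The codebook size `k` is a free parameter here; the abstract does not state the value the authors used.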

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qin, Z., Liu, W., Wan, T. (2013). A Bag-of-Tones Model with MFCC Features for Musical Genre Classification. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53914-5_48

  • DOI: https://doi.org/10.1007/978-3-642-53914-5_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-53913-8

  • Online ISBN: 978-3-642-53914-5

  • eBook Packages: Computer Science, Computer Science (R0)
