
Music Genre Classification Using Stacked Auto-Encoders

Neural Approaches to Dynamics of Signal Exchanges

Part of the book series: Smart Innovation, Systems and Technologies ((SIST,volume 151))

Abstract

In this paper, we propose an architecture based on a stacked auto-encoder (SAE) for music genre classification. Each level of the stacked architecture takes as input a stack of hidden representations produced by the previous level, each related to a different frame of the input signal. In this way, the proposed architecture achieves a more robust classification than a standard SAE. The first level of the SAE is fed with a set of 57 specific features extracted from the music signals. Experimental results show the effectiveness of the proposed approach with respect to other state-of-the-art methods. In particular, the proposed architecture is compared to the support vector machine (SVM), multi-layer perceptron (MLP) and logistic regression (LR).
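The frame-stacking idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the chapter's implementation: the group size, the hidden-layer sizes, the sigmoid activation and the random (untrained) weights are all assumptions chosen only to show how hidden representations of consecutive frames might be concatenated before being passed to the next level. Only the figure of 57 features per frame comes from the abstract.

```python
import numpy as np

def stack_frames(h, group):
    """Concatenate `group` consecutive frame vectors into one longer vector."""
    n_frames, dim = h.shape
    n_groups = n_frames // group  # trailing frames that do not fill a group are dropped
    return h[: n_groups * group].reshape(n_groups, group * dim)

def encode(x, w, b):
    """One encoder layer with a logistic sigmoid (weights here are untrained)."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

rng = np.random.default_rng(0)
n_frames, n_features = 8, 57          # 57 features per frame, as in the abstract
hidden1, hidden2, group = 20, 10, 2   # hypothetical sizes, for illustration only

# Level 1: encode each frame independently.
x = rng.standard_normal((n_frames, n_features))
w1 = rng.standard_normal((n_features, hidden1)) * 0.1
b1 = np.zeros(hidden1)
h1 = encode(x, w1, b1)                # shape (8, 20)

# Stack the hidden representations of consecutive frames.
s1 = stack_frames(h1, group)          # shape (4, 40)

# Level 2: encode the stacked representations.
w2 = rng.standard_normal((group * hidden1, hidden2)) * 0.1
b2 = np.zeros(hidden2)
h2 = encode(s1, w2, b2)               # shape (4, 10)
```

In a trained model each level's weights would be learned, for example by minimising a reconstruction error, before the next level is added; the sketch only traces how the array shapes change from one level to the next.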


Notes

  1. The library can be downloaded from: https://librosa.github.io/librosa/feature.html.



Corresponding author

Correspondence to Michele Scarpiniti.


Copyright information

© 2020 Springer Nature Singapore Pte Ltd.


Cite this chapter

Scarpiniti, M., Scardapane, S., Comminiello, D., Uncini, A. (2020). Music Genre Classification Using Stacked Auto-Encoders. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Neural Approaches to Dynamics of Signal Exchanges. Smart Innovation, Systems and Technologies, vol 151. Springer, Singapore. https://doi.org/10.1007/978-981-13-8950-4_2
