Music Genre Classification Using Stacked Auto-Encoders

Scarpiniti, Michele; Scardapane, Simone; Comminiello, Danilo; Uncini, Aurelio

doi:10.1007/978-981-13-8950-4_2


Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 151)


Abstract

In this paper, we propose an architecture based on a stacked auto-encoder (SAE) for music genre classification. Each level of the stacked architecture takes as input a stack of hidden representations produced by the previous level for different frames of the input signal. In this way, the proposed architecture achieves a more robust classification than a standard SAE. The first level of the SAE is fed with a set of 57 characteristic features extracted from the music signals. Experimental results show the effectiveness of the proposed approach with respect to other state-of-the-art methods. In particular, the proposed architecture is compared with the support vector machine (SVM), multi-layer perceptron (MLP) and logistic regression (LR).
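The frame-stacking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hidden sizes, the group length k, the sigmoid activation, the use of non-overlapping frame groups and the random (untrained) weights are all assumptions made purely for illustration. Only the figure of 57 input features per frame comes from the abstract.

```python
import numpy as np

def encode(x, W, b):
    """Sigmoid encoder of one auto-encoder level (weights would normally be pre-trained)."""
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

def stack_frames(h, k):
    """Concatenate the hidden codes of k consecutive frames (non-overlapping groups)."""
    n_groups = h.shape[0] // k
    return h[: n_groups * k].reshape(n_groups, k * h.shape[1])

rng = np.random.default_rng(0)
n_frames, n_features = 12, 57      # 57 features per frame, as stated in the abstract
hidden1, hidden2, k = 32, 16, 3    # hypothetical layer sizes and group length

# Placeholder data and random weights; a real system would use extracted
# audio features and weights learned by training each auto-encoder.
X = rng.standard_normal((n_frames, n_features))
W1, b1 = 0.1 * rng.standard_normal((n_features, hidden1)), np.zeros(hidden1)
W2, b2 = 0.1 * rng.standard_normal((k * hidden1, hidden2)), np.zeros(hidden2)

H1 = encode(X, W1, b1)     # level-1 codes, one per frame: shape (12, 32)
S1 = stack_frames(H1, k)   # codes of 3 frames side by side: shape (4, 96)
H2 = encode(S1, W2, b2)    # level-2 codes, one per frame group: shape (4, 16)
print(H1.shape, S1.shape, H2.shape)
```

In this sketch each higher level sees a longer temporal context than the one below it, which is one plausible reading of why stacking representations across frames could make the final classification more robust.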


Notes

1. The library can be downloaded from: https://librosa.github.io/librosa/feature.html.



Author information

Authors and Affiliations

Department of Information Engineering, Electronics and Telecommunications (DIET), “Sapienza” University of Rome, Rome, Italy
Michele Scarpiniti, Simone Scardapane, Danilo Comminiello & Aurelio Uncini


Corresponding author

Correspondence to Michele Scarpiniti.

Editor information

Editors and Affiliations

Department of Psychology, University of Campania Luigi Vanvitelli, Caserta, Italy
Anna Esposito
Tecnocampus, Mataró, Spain
Marcos Faundez-Zanuy
Department of Civil, Environment, Energy and Materials Engineering, Mediterranea University of Reggio Calabria, Reggio Calabria, Italy
Francesco Carlo Morabito
Dipartimento di Elettronica e Telecomunicazioni, Politecnico di Torino, Turin, Italy
Eros Pasero


Cite this chapter

Scarpiniti, M., Scardapane, S., Comminiello, D., Uncini, A. (2020). Music Genre Classification Using Stacked Auto-Encoders. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Neural Approaches to Dynamics of Signal Exchanges. Smart Innovation, Systems and Technologies, vol 151. Springer, Singapore. https://doi.org/10.1007/978-981-13-8950-4_2


DOI: https://doi.org/10.1007/978-981-13-8950-4_2
Published: 19 September 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8949-8
Online ISBN: 978-981-13-8950-4
eBook Packages: Intelligent Technologies and Robotics (R0)
