Abstract
Music genre classification refers to identifying pieces of music that belong to a particular tradition by assigning labels called genres. Recommendation systems use classification techniques to automatically group songs by genre or to cluster music of similar genres. Studies show that deep recurrent neural networks (RNNs) can resolve complex temporal features of an audio signal and identify music genres with good accuracy. This research experiments with different RNN variants, including LSTM and IndRNN, on the GTZAN dataset to predict music genres. Scattering transforms, together with Mel-Frequency Cepstral Coefficients (MFCCs), are used to construct the input feature vector. The study investigates various LSTM and simple RNN network architectures. Experimental results show that a 5-layer stacked independent RNN achieved 84% accuracy using this input feature vector.
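The key distinction of the IndRNN cell used in this work is that each hidden unit has its own scalar recurrent weight rather than a full recurrent matrix, which makes deep stacking and long sequences tractable. The following is a minimal numpy sketch of a single IndRNN layer applied to a sequence of feature frames (e.g., MFCC vectors); the dimensions, weight names, and ReLU activation are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def indrnn_step(x, h_prev, W, u, b):
    # IndRNN update: the recurrent connection is elementwise (u * h_prev),
    # so each hidden neuron only recurs into itself:
    #   h_t = relu(W @ x_t + u * h_{t-1} + b)
    return np.maximum(0.0, W @ x + u * h_prev + b)

def indrnn_forward(xs, W, u, b):
    # Run one IndRNN layer over a sequence of feature frames
    # (shape: [time, input_dim]); returns the final hidden state.
    h = np.zeros(W.shape[0])
    for x in xs:
        h = indrnn_step(x, h, W, u, b)
    return h
```

In a stacked IndRNN, the hidden-state sequence of one such layer would be fed as the input sequence of the next, with a softmax classifier over the genre labels on top of the final layer.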
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Kakarla, C., Eshwarappa, V., Babu Saheer, L., Maktabdar Oghaz, M. (2022). Recurrent Neural Networks for Music Genre Classification. In: Bramer, M., Stahl, F. (eds) Artificial Intelligence XXXIX. SGAI-AI 2022. Lecture Notes in Computer Science(), vol 13652. Springer, Cham. https://doi.org/10.1007/978-3-031-21441-7_19
Print ISBN: 978-3-031-21440-0
Online ISBN: 978-3-031-21441-7