Abstract
Music genre classification plays a crucial role in organizing and exploring large music collections, enabling personalized music recommendations, and enhancing music-related services. This paper presents a novel approach to music genre classification that combines Generative Adversarial Networks (GANs) with the Fourier Transform and the Wavelet Transform. The main objective is to use GANs to extract discriminative features from audio data and to classify music accurately into genres. The proposed methodology has two key components: a generator and a discriminator. The generator produces synthetic audio samples that resemble real music, while the discriminator learns to distinguish real samples from synthetic ones. When the GAN is trained on a diverse dataset of music samples from various genres, the discriminator becomes proficient at recognizing genre-specific features. To improve classification accuracy, the Fourier Transform and the Wavelet Transform are applied to extract both frequency-domain and time-domain features from the audio data. Classifiers such as support vector machines and neural networks are then used to separate the genres. Experiments on multiple datasets demonstrate the effectiveness of the proposed approach: the method achieves 98.97% accuracy on the GTZAN dataset, 92.47% on the FMA-Small dataset, and 92.98% on the ISMIR Genre dataset, significantly outperforming traditional classification methods. These results highlight the value of combining GANs with the Fourier Transform and the Wavelet Transform for accurate and robust music genre classification.
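The feature-extraction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fourier_features`, `haar_wavelet_features`, `extract_features`), the choice of a Haar wavelet, the band count, the number of decomposition levels, and the synthetic test tone are all assumptions made for illustration. It only shows how Fourier and wavelet descriptors might be combined into one feature vector that a downstream classifier could consume.

```python
import numpy as np

def fourier_features(signal, n_bands=8):
    """Log of the mean FFT magnitude in n_bands equal-width frequency bands."""
    spectrum = np.abs(np.fft.rfft(signal))
    bands = np.array_split(spectrum, n_bands)
    return np.array([np.log1p(band.mean()) for band in bands])

def haar_wavelet_features(signal, levels=4):
    """Energy of the Haar detail coefficients at each decomposition level.

    The Haar wavelet is an assumption here; the abstract does not name one.
    """
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        if len(approx) % 2:
            approx = approx[:-1]  # drop the last sample so pairs line up
        pairs = approx.reshape(-1, 2)
        detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)
        energies.append(np.sum(detail ** 2))
    return np.array(energies)

def extract_features(signal, n_bands=8, levels=4):
    """Concatenate the Fourier and wavelet descriptors into one vector."""
    return np.concatenate([
        fourier_features(signal, n_bands),
        haar_wavelet_features(signal, levels),
    ])

# Hypothetical input: one second of a 440 Hz tone at 22,050 Hz.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 440 * t)
features = extract_features(clip)
print(features.shape)
```

In a pipeline of the kind the abstract describes, vectors like `features` would be computed per audio clip and passed to a classifier such as a support vector machine; how the paper combines them with the discriminator's learned features is not specified here.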








Data availability
The dataset generated and analyzed during the current study is available from the corresponding author on reasonable request.
Funding
No funding was received for this research.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Human or animal rights
Not applicable.
About this article
Cite this article
Dwivedi, P., Islam, B. Generative Adversarial Networks Based Framework for Music Genre Classification. SN COMPUT. SCI. 5, 1149 (2024). https://doi.org/10.1007/s42979-024-03531-8
DOI: https://doi.org/10.1007/s42979-024-03531-8