ABSTRACT
Music genres exist in every period and sphere of human life, and each genre represents a set of shared conventions. Today, music platforms let people listen to any genre they want. However, as the number of genres grows, managing these platforms becomes more difficult. Language representation models such as BERT and DistilBERT have proven useful for learning universal language representations and have achieved strong results on many language understanding tasks. In this study, we apply language representation models to music genre classification using song lyrics. We examine whether they outperform traditional deep learning models on this task by comparing classification results and computation times. Experimental results show that BERT outperforms the other models on one-label and multi-label classification, with accuracies of 77.63% and 71.29%, respectively. On the other hand, considering the time taken for one epoch, DistilBERT runs 4 times faster than BERT.
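To make the distinction between the one-label and multi-label settings concrete, the following is a minimal sketch of how a classifier's raw output scores (logits) might be turned into genre predictions in each case. It is not the implementation used in this study: the genre list, the logit values, the 0.5 threshold, and the function names are all hypothetical, and the accuracy helper uses exact matching, which is only one possible way to score multi-label output.

```python
import math

# Hypothetical genre inventory; the study's actual label set is not given here.
GENRES = ["rock", "pop", "hip-hop", "country"]


def softmax(logits):
    """Convert logits into a probability distribution that sums to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map a single logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_one_label(logits, genres):
    """One-label setting: genres compete, so pick the single most probable one."""
    probs = softmax(logits)
    best = max(range(len(genres)), key=lambda i: probs[i])
    return genres[best]


def predict_multi_label(logits, genres, threshold=0.5):
    """Multi-label setting: score each genre independently and keep those
    whose probability reaches the (assumed) threshold."""
    return [g for g, x in zip(genres, logits) if sigmoid(x) >= threshold]


def accuracy(predicted, gold):
    """Fraction of examples whose prediction exactly equals the gold label(s)."""
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Hypothetical logits for one song's lyrics.
logits = [2.1, 0.3, -1.2, 0.8]
one_label = predict_one_label(logits, GENRES)
multi_label = predict_multi_label(logits, GENRES)
print(one_label)
print(multi_label)
```

In the one-label case the scores are normalized jointly, so exactly one genre is returned; in the multi-label case each genre is judged on its own, so a song can receive several genres or none.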