
Development of an intelligent model for musical key estimation using machine learning techniques

Multimedia Tools and Applications

A Correction to this article was published on 13 April 2022

This article has been updated

Abstract

Every piece of music is characterized by its key, melody, harmony, metre, and rhythm. Music information retrieval tasks such as transcription, chord estimation, and harmony analysis depend on knowledge of the musical key. Although several studies have sought an optimal key profile for a given melody, key finding with machine learning techniques remains largely unexplored. In this paper, we present a novel approach to determining the musical key of a given song. The proposed model features a simple architecture for learning and classification. It was tested with four machine learning algorithms: K-nearest neighbor (KNN), Naïve Bayes (NB), Discriminant Analysis (DA), and Support Vector Machine (SVM). In addition, we compiled a dataset covering different genres of music for our experiments. The Pitch Class Profile (PCP) distribution of our dataset was compared with those of well-known datasets and was found to be similar. We then optimized the model with the best-performing of the four classifiers. SVM achieved the highest accuracy, 91.49%, together with the highest precision and recall. KNN reached an accuracy of 89.76%, followed by Naïve Bayes at 87.11% and Discriminant Analysis at 86.77%. The corresponding error rates ranged from 8.51% (SVM) to 13.23% (Discriminant Analysis). These results show that the proposed model with the SVM classifier achieves the highest accuracy of the four and that, in comparison with recent publications, it can play an important role in efficient key determination, since it combines music-theoretic information with supervised learning techniques for classification.
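The abstract does not spell out how the PCP features are computed or how the classifiers are configured, so the following is only a minimal sketch of the general idea under stated assumptions: notes are given as (MIDI pitch, duration) pairs, the PCP is duration-weighted and normalised, and a small pure-Python K-nearest-neighbor vote stands in for the classifiers used in the paper. All function names and the toy melodies are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import sqrt


def pitch_class_profile(notes):
    """Return a 12-bin, duration-weighted pitch class profile.

    `notes` is a list of (midi_pitch, duration) pairs. This input
    format is an assumption made for the example.
    """
    bins = [0.0] * 12
    for midi_pitch, duration in notes:
        bins[midi_pitch % 12] += duration  # fold all octaves onto 12 classes
    total = sum(bins)
    # Normalise so that pieces of different lengths are comparable.
    return [b / total for b in bins] if total > 0 else bins


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Majority vote among the k training profiles closest to `query`.

    `train` is a list of (profile, key_label) pairs.
    """
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy_and_error(predicted, actual):
    """Accuracy and error rate in percent; the two sum to 100."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    accuracy = 100.0 * correct / len(actual)
    return accuracy, 100.0 - accuracy


# Toy, hand-made melodies (hypothetical data, not the paper's dataset).
c_scale = [(p, 1) for p in (60, 62, 64, 65, 67, 69, 71)]
c_tune = [(60, 2), (64, 1), (67, 1), (65, 1), (62, 1), (71, 1), (69, 1)]
g_scale = [(p, 1) for p in (67, 69, 71, 72, 74, 76, 78)]
g_tune = [(67, 2), (71, 1), (74, 1), (66, 1), (69, 1), (72, 1), (76, 1)]

train = [
    (pitch_class_profile(c_scale), "C major"),
    (pitch_class_profile(c_tune), "C major"),
    (pitch_class_profile(g_scale), "G major"),
    (pitch_class_profile(g_tune), "G major"),
]

query = pitch_class_profile([(60, 1), (62, 1), (64, 1), (65, 2), (67, 1), (60, 1)])
predicted_key = knn_predict(train, query, k=3)
print(predicted_key)
```

The last helper reflects the relation behind the reported figures: an accuracy of 91.49% corresponds to an error rate of 100 − 91.49 = 8.51%, and 86.77% to 13.23%.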





Acknowledgments

We thank Dr. Aswan Thomas Elias (Director, Harp N′ Lyre Academia Musica, Kottayam) and Mr. Siju T. Mathew (HOD Western Music, Indian Music, Band, and IBDP CAS Coordinator, Good Shepherd International School, Ooty) for providing musical advice during the initial phase of our work.

Author information


Contributions

All authors contributed to the study’s conception and design. Material preparation, data collection, and analysis were performed by Abraham George and Dr. X. Anitha Mary. The first draft of the manuscript was written by Abraham George. Reviewing, editing, and approval of the final manuscript were done by Dr. X. Anitha Mary and Dr. S. Thomas George.

Corresponding author

Correspondence to X. Anitha Mary.

Ethics declarations

Ethics approval

No human participants or animals were involved in this research.

Conflicts of interest/competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The corresponding author was incorrect.


About this article


Cite this article

George, A., Mary, X.A. & George, S.T. Development of an intelligent model for musical key estimation using machine learning techniques. Multimed Tools Appl 81, 19945–19964 (2022). https://doi.org/10.1007/s11042-022-12432-y


  • DOI: https://doi.org/10.1007/s11042-022-12432-y
