Abstract
In this paper, we propose an automated tabla syllable transcription method based on image processing. Beginner tabla learners progress faster when they can see how a stroke is played rather than only hear it, which motivates a vision-based approach. The method follows the way a human perceives and learns tabla playing. We define three regions of interest on each drum, the dayan and the bayan. The image features of the fingers are tracked over these regions to determine the exact region where each stroke lands, and therefore which syllable it produces. Each frame is first assigned an initial syllable label. A supervised classification step then prunes the labeling for each stroke by comparing incoming frames with the reference image of a particular syllable using the structural similarity index. The syllables are classified on this basis and the transcription is produced automatically. The proposed method correctly transcribes 97.14% of the tabla syllables, with an F1 score of 0.98.
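The frame-to-reference comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes grayscale frames flattened into plain Python lists, computes a single-window (global) form of the structural similarity index instead of the usual sliding-window average, and uses hypothetical names (`global_ssim`, `classify_frame`), a hypothetical acceptance threshold of 0.5, and illustrative syllable labels ("na", "ge") with toy pixel values.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equal-length grayscale pixel sequences.

    Simplification: the statistics are taken over the whole image at once,
    not over local windows that are then averaged.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    # Stabilising constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def classify_frame(frame, references, threshold=0.5):
    """Return (label, score) for the reference most similar to the frame.

    The label is None when the best score is below the (hypothetical) threshold.
    """
    best_label, best_score = None, float("-inf")
    for label, reference in references.items():
        score = global_ssim(frame, reference)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Toy 2x2 "images", flattened row by row; the values are purely illustrative.
references = {
    "na": [200, 200, 50, 50],
    "ge": [50, 50, 200, 200],
}
incoming_frame = [190, 205, 60, 45]
label, score = classify_frame(incoming_frame, references)
print(label, round(score, 3))
```

In a real pipeline the frames would be cropped to the relevant region of interest first, and a windowed SSIM from an image-processing library would normally replace the global form used here.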
Cite this article
Bhalarao, R., Raval, M. Automated tabla syllable transcription using image processing techniques. Multimed Tools Appl 79, 28885–28899 (2020). https://doi.org/10.1007/s11042-020-09417-0
DOI: https://doi.org/10.1007/s11042-020-09417-0