Automated tabla syllable transcription using image processing techniques

Multimedia Tools and Applications

Abstract

In this paper, we propose an automated tabla syllable transcription method based on image processing techniques. A beginner tabla learner learns faster by watching the strokes than by listening alone, which motivates a visual approach. We follow a human-perception-based approach to learning tabla and implement it. We define three regions of interest on each drum, the dayan and the bayan. The image features of the fingers are tracked over these regions to determine the exact region where a finger strikes and produces a particular syllable. Each frame is initially labeled with a syllable. Finally, supervised classification prunes the labels for each stroke by comparing incoming frames to the reference image for a particular syllable using the structural similarity index. The syllables are classified on this basis and the transcription is produced automatically. The proposed method correctly transcribes 97.14% of the tabla syllables, with an F1 score of 0.98.
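The frame-to-reference comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of their pipeline, so this example assumes grayscale images stored as lists of rows, computes the structural similarity index over a single global window rather than the usual sliding window, and uses the commonly cited constants K1 = 0.01 and K2 = 0.03. The function names, the tiny 4x4 images, and the pairing of the labels "na" and "ge" with those images are hypothetical and chosen only for illustration.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equal-sized grayscale images.

    Simplification (assumption): statistics are taken over the whole image
    instead of averaging SSIM over local sliding windows.
    """
    a = [float(p) for row in x for p in row]
    b = [float(p) for row in y for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((p - mu_a) ** 2 for p in a) / n
    var_b = sum((q - mu_b) ** 2 for q in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n
    # Stabilizing constants with the commonly used K1 = 0.01, K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def classify_frame(frame, references):
    """Return (label, score) of the reference image most similar to frame."""
    scores = {label: global_ssim(frame, ref) for label, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical 4x4 reference images, one per syllable label.
references = {
    "na": [[0, 0, 255, 255] for _ in range(4)],
    "ge": [[255, 255, 0, 0] for _ in range(4)],
}

# A hypothetical incoming frame: a slightly noisy version of the "na" reference.
frame = [[10, 5, 250, 240] for _ in range(4)]

label, score = classify_frame(frame, references)
print(label, round(score, 3))
```

In practice the comparison would be restricted to the region of interest around the strike, and a minimum similarity threshold could be used to reject frames that match no reference; both choices are assumptions here, since the abstract does not specify them.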

Author information

Corresponding author

Correspondence to Raghavendra Bhalarao.

About this article

Cite this article

Bhalarao, R., Raval, M. Automated tabla syllable transcription using image processing techniques. Multimed Tools Appl 79, 28885–28899 (2020). https://doi.org/10.1007/s11042-020-09417-0

  • DOI: https://doi.org/10.1007/s11042-020-09417-0
