DOI: 10.1145/3409501.3409540
research-article

Robust Piano Music Transcription Based on Computer Vision

Published: 25 August 2020

ABSTRACT

Automatic music transcription, which aims to convert acoustic music signals into symbolic notation, has recently attracted increasing attention. To address the challenges of automatic music transcription based on acoustic information, traditional approaches use the Hough transform to locate the piano keyboard and a weak classifier to detect pressed keys. However, the Hough transform and the weak classifier detect poorly in changing environments. In this paper, we devise a robust visual piano transcription system that uses semantic segmentation to detect the piano keyboard and a CNN-based classifier to detect pressed keys, which improves frame-level transcription results. In addition, because public datasets for visual piano transcription are lacking, we propose a new dataset for this task. To demonstrate the effectiveness of our system, we evaluate it on both a previously published dataset and our proposed dataset, where it significantly outperforms state-of-the-art approaches.
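
As a rough illustration of what a frame-level transcription result can look like, the sketch below scores per-frame pressed-key predictions against a reference. The abstract does not state which metrics the paper reports, so pooled precision, recall, and F1 are assumptions here, and the function name `frame_level_scores`, the key indices, and the sample data are hypothetical.

```python
from typing import List, Set, Tuple

# One video frame is modelled as the set of key indices judged to be pressed.
# The indexing scheme (e.g. 0..87 for an 88-key piano) is an assumption.
Frame = Set[int]


def frame_level_scores(
    reference: List[Frame], predicted: List[Frame]
) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) pooled over all frames and keys."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same number of frames")

    tp = fp = fn = 0
    for ref, pred in zip(reference, predicted):
        tp += len(ref & pred)   # keys correctly detected as pressed
        fp += len(pred - ref)   # keys detected but not actually pressed
        fn += len(ref - pred)   # pressed keys that were missed

    # Guard against empty denominators (e.g. no key pressed anywhere).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical four-frame clip: one missed key in frame 2,
    # one spurious key in frame 3.
    reference = [{39}, {39, 43}, {43}, set()]
    predicted = [{39}, {39}, {43, 46}, set()]
    print(frame_level_scores(reference, predicted))
```

Pooling counts over all frames, rather than averaging per-frame scores, keeps silent frames from dominating the result; either convention could be used as long as it is applied consistently across the systems being compared.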


    • Published in

      ACM Other conferences
      HPCCT & BDAI '20: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence
      July 2020
      276 pages
      ISBN:9781450375603
      DOI:10.1145/3409501

      Copyright © 2020 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited
