DOI: 10.1145/3457682.3457710
Research article

An EB-enhanced CNN Model for Piano Music Transcription

Published: 21 June 2021

ABSTRACT

Automatic Music Transcription (AMT) is an important task in Music Information Retrieval (MIR). Much prior work has focused on the structure of the Convolutional Neural Network (CNN) used for transcription. In this paper, we construct a CNN-based piano music transcription model that takes an energy-balanced (EB) constant-Q transform spectrogram as input; we call it the EB-enhanced CNN model. Unlike standard CNN-based methods, the proposed model balances the energy of the input features, so that many pitches previously missed because of their weak energy can be detected. Training and evaluation are performed on MAPS, a public dataset for piano transcription. Our technique improves the F1 score by 3.53% over the state-of-the-art method on the MAPS ENSTDkCl subset.
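The F1 score reported above combines precision and recall, and a missed pitch counts as a false negative that lowers recall. The following is a minimal sketch of a frame-level F1 computation, not the authors' evaluation code. The abstract does not say whether the reported score is frame-level or note-level, so the frame-level choice is an assumption, and the function name `frame_f1` and the toy piano rolls are hypothetical.

```python
def frame_f1(reference, estimate):
    """Return (precision, recall, F1) for two frame-aligned transcriptions.

    Each argument is a list of frames; each frame is a set of active
    MIDI pitch numbers. This representation is an illustrative assumption.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same number of frames")

    tp = fp = fn = 0
    for ref, est in zip(reference, estimate):
        tp += len(ref & est)   # pitches correctly detected
        fp += len(est - ref)   # pitches detected but not played
        fn += len(ref - est)   # pitches played but missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: three frames of a C major chord followed by a single D.
reference = [{60, 64, 67}, {60, 64, 67}, {62}]
estimate = [{60, 64}, {60, 64, 67, 72}, {62}]  # one missed pitch, one extra pitch

precision, recall, f1 = frame_f1(reference, estimate)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In the toy example, the missed pitch 67 in the first frame is the kind of error the abstract attributes to weak energy: recovering it would raise recall, and with it the F1 score.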

  • Published in

    ICMLC '21: Proceedings of the 2021 13th International Conference on Machine Learning and Computing
    February 2021
    601 pages
    ISBN: 9781450389310
    DOI: 10.1145/3457682

    Copyright © 2021 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited