Evaluation of the Convolutional NMF for Supervised Polyphonic Music Transcription and Note Isolation

  • Conference paper
Latent Variable Analysis and Signal Separation (LVA/ICA 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9237)

Abstract

We evaluate the convolutive nonnegative matrix factorization (NMF) in the context of automatic music transcription of polyphonic piano recordings and the associated problem of note isolation. Our aim is to determine whether the temporal continuity of piano notes is faithfully captured by the convolutional kernels and how the performance scales with complexity. Systematic studies of this kind are lacking in the existing literature. We make use of established measures of accuracy and similarity. NMF dictionaries covering the piano’s pitch range are learned from a given sample bank of isolated notes. The kernel size, also called the patch size, is varied. Using a measure of performance advantage, we show that the improvements due to convolved bases do not justify the extra computational effort compared to the standard NMF. In particular, this holds for the more realistic case in which the dictionary does not fully correspond to the mixture signal. Further pertinent conclusions are drawn as well.
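
To make the model in the abstract concrete, the following is a minimal sketch of a supervised convolutive NMF, not the authors' implementation. It assumes a Kullback-Leibler cost and a joint multiplicative update of the activations with the dictionary held fixed; the abstract specifies neither. All names (`reconstruct`, `fit_activations`, the array layout of `W`) are hypothetical. With a kernel size of 1 the model reduces to the standard NMF, which is the baseline the paper compares against.

```python
import numpy as np


def shift_right(X, t):
    """Shift the columns of X by t positions to the right, zero-filling."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    out[:, t:] = X[:, :-t]
    return out


def shift_left(X, t):
    """Shift the columns of X by t positions to the left, zero-filling."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    out[:, :-t] = X[:, t:]
    return out


def reconstruct(W, H):
    """Convolutive model: V_hat = sum_t W[t] @ shift_right(H, t).

    W has shape (T, F, K): T kernel frames, F frequency bins, K bases.
    H has shape (K, N): one activation row per basis, N time frames.
    """
    return sum(W[t] @ shift_right(H, t) for t in range(W.shape[0]))


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def fit_activations(V, W, n_iter=100, eps=1e-12, seed=0):
    """Estimate nonnegative activations H for a fixed dictionary W.

    Assumption: KL cost with a joint multiplicative update over all
    kernel frames; other costs and update schedules are possible.
    """
    rng = np.random.default_rng(seed)
    T, _, K = W.shape
    H = rng.random((K, V.shape[1])) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        ratio = V / (reconstruct(W, H) + eps)
        num = sum(W[t].T @ shift_left(ratio, t) for t in range(T))
        den = sum(W[t].T @ shift_left(ones, t) for t in range(T))
        H *= num / (den + eps)
    return H
```

In a transcription setting, `V` would be a magnitude spectrogram of the recording and each slice `W[:, :, k]` a short spectrogram patch learned from one isolated note; the rows of the returned `H` then indicate when each note is active. The cost per iteration grows roughly linearly with the kernel size `T`, which is the extra computational effort the abstract weighs against the gain in accuracy.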

S. Gorlow is now with Sony Computer Science Laboratory (CSL) in Paris, France.

This work was funded in part by the Yamaha Corporation.


Notes

  1. https://staff.aist.go.jp/m.goto/RWC-MDB/

  2. http://www.mpi-inf.mpg.de/resources/SMD/SMD_MIDI-Audio-Piano-Music.html

  3. The results shown are representative of what we experienced for different piano recordings.

  4. https://code.google.com/p/nmflib/

  5. The number was chosen empirically. Above it, no significant improvement was observed.

Corresponding author

Correspondence to Stanislaw Gorlow.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gorlow, S., Janer, J. (2015). Evaluation of the Convolutional NMF for Supervised Polyphonic Music Transcription and Note Isolation. In: Vincent, E., Yeredor, A., Koldovský, Z., Tichavský, P. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2015. Lecture Notes in Computer Science, vol 9237. Springer, Cham. https://doi.org/10.1007/978-3-319-22482-4_51

  • DOI: https://doi.org/10.1007/978-3-319-22482-4_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22481-7

  • Online ISBN: 978-3-319-22482-4

  • eBook Packages: Computer Science, Computer Science (R0)
