Abstract
One approach to Automatic Music Transcription (AMT) is to decompose a spectrogram using a dictionary matrix in which each column is a pitch-labelled note spectrum atom. AMT performance is typically measured by frame-based comparison; an alternative perspective is event-based analysis. We have previously proposed an AMT system based on structured sparse representations. Here we describe the method and present experimental results, which appear promising. Inspection of the piano roll, the graphical output of an AMT system, suggests that performance may be slightly better than the AMT metrics indicate. This leads us to perform an oracle analysis of the AMT system, with some interesting outcomes that may have implications for decomposition-based AMT in general.
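As a rough illustration of the decomposition setting the abstract describes, the following sketch decomposes a synthetic spectrogram over a fixed dictionary of pitch-labelled atoms, thresholds the activations into a piano roll, and scores it frame by frame. It is not the structured sparse method of the paper: it swaps in plain non-negative multiplicative updates with a fixed dictionary, and the toy harmonic atoms, the function names `decompose` and `frame_metrics`, and the threshold of 0.1 are all assumptions made for the example.

```python
import numpy as np

def decompose(S, D, n_iter=200, eps=1e-12):
    """Estimate non-negative activations H with S ~= D @ H, keeping D fixed.

    Uses Euclidean multiplicative updates (a stand-in technique, not the
    structured sparse decomposition proposed in the paper).
    """
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], S.shape[1])) + eps
    DtS = D.T @ S
    DtD = D.T @ D
    for _ in range(n_iter):
        H *= DtS / (DtD @ H + eps)
    return H

def frame_metrics(est_roll, ref_roll):
    """Frame-based precision, recall and F-measure for boolean piano rolls."""
    tp = np.sum(est_roll & ref_roll)
    fp = np.sum(est_roll & ~ref_roll)
    fn = np.sum(~est_roll & ref_roll)
    precision = float(tp / max(tp + fp, 1))
    recall = float(tp / max(tp + fn, 1))
    f_measure = 2 * precision * recall / max(precision + recall, 1e-12)
    return precision, recall, f_measure

# Hypothetical toy dictionary: one harmonic atom per pitch, one pitch per column.
n_bins, n_pitches, n_frames = 256, 12, 50
D = np.zeros((n_bins, n_pitches))
for k in range(n_pitches):
    f0 = 10 + 3 * k                      # assumed fundamental bin for pitch k
    h = 1
    while h * f0 < n_bins:
        D[h * f0, k] = 1.0 / h           # harmonic amplitudes decay as 1/h
        h += 1
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms

# Synthetic ground-truth piano roll and the spectrogram it generates.
rng = np.random.default_rng(1)
ref_roll = rng.random((n_pitches, n_frames)) < 0.2
gains = rng.uniform(0.5, 1.0, size=ref_roll.shape)
S = D @ (ref_roll * gains)

H = decompose(S, D)
est_roll = H > 0.1                       # assumed activation threshold
precision, recall, f_measure = frame_metrics(est_roll, ref_roll)
print(f"P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

An event-based evaluation would instead group consecutive active frames of each pitch into notes and compare onsets, which can give a different picture from the frame-based scores computed here.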
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
O’Hanlon, K., Nagano, H., Plumbley, M.D. (2013). Using Oracle Analysis for Decomposition-Based Automatic Music Transcription. In: Aramaki, M., Barthet, M., Kronland-Martinet, R., Ystad, S. (eds) From Sounds to Music and Emotions. CMMR 2012. Lecture Notes in Computer Science, vol 7900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41248-6_19
DOI: https://doi.org/10.1007/978-3-642-41248-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41247-9
Online ISBN: 978-3-642-41248-6
eBook Packages: Computer Science, Computer Science (R0)