Abstract
This paper discusses techniques for pattern induction and matching in musical audio. At all levels of music (harmony, melody, rhythm, and instrumentation), the temporal sequence of events can be subdivided into shorter patterns that are sometimes repeated and transformed. The paper describes methods for extracting such patterns from musical audio signals (pattern induction) and computationally feasible methods for retrieving similar patterns from a large database of songs (pattern matching).
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klapuri, A. (2011). Pattern Induction and Matching in Music Signals. In: Ystad, S., Aramaki, M., Kronland-Martinet, R., Jensen, K. (eds) Exploring Music Contents. CMMR 2010. Lecture Notes in Computer Science, vol 6684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23126-1_13
DOI: https://doi.org/10.1007/978-3-642-23126-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23125-4
Online ISBN: 978-3-642-23126-1
eBook Packages: Computer Science (R0)