Using quadratic programming to estimate feature relevance in structural analyses of music

Authors:
Jordan B.L. Smith, Queen Mary University of London, London, United Kingdom
Elaine Chew, Queen Mary University of London, London, United Kingdom

MM '13: Proceedings of the 21st ACM international conference on Multimedia, October 2013, Pages 113–122, https://doi.org/10.1145/2502081.2502124

Published: 21 October 2013

ABSTRACT

To identify repeated patterns and contrasting sections in music, it is common to use self-similarity matrices (SSMs) to visualize and estimate structure. We introduce a novel application for SSMs derived from audio recordings: using them to learn about the potential reasoning behind a listener's annotation. We use SSMs generated by musically-motivated audio features at various timescales to represent contributions to a structural annotation. Since a listener's attention can shift among musical features (e.g., rhythm, timbre, and harmony) throughout a piece, we further break down the SSMs into section-wise components and use quadratic programming (QP) to minimize the distance between a linear sum of these components and the annotated description. We posit that the optimal section-wise weights on the feature components may indicate the features to which a listener attended when annotating a piece, and thus may help us to understand why two listeners disagreed about a piece's structure. We discuss some examples that substantiate the claim that feature relevance varies throughout a piece, using our method to investigate differences between listeners' interpretations, and lastly propose some variations on our method.
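
The optimization described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the toy 4-frame SSMs, the feature names `harmony` and `timbre`, the choice to define a section-wise component by keeping only the rows of a feature SSM that fall inside that section, the non-negativity constraint on the weights, and the use of projected gradient descent in place of a dedicated QP solver.

```python
def annotation_ssm(labels):
    """Binary SSM: 1 where two frames share a section label, else 0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def row_mask(ssm, start, end):
    """Keep rows start..end-1 of an SSM and zero the rest.

    Assumption: this row mask stands in for a 'section-wise component'.
    """
    return [row[:] if start <= i < end else [0.0] * len(row)
            for i, row in enumerate(ssm)]


def flatten(matrix):
    return [v for row in matrix for v in row]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_nonneg_qp(components, target, iterations=2000):
    """Minimise ||sum_k x_k * components[k] - target||^2 subject to x >= 0.

    Projected gradient descent on the quadratic objective; a simple
    stand-in for a real QP solver, adequate only for tiny problems.
    """
    gram = [[dot(a, b) for b in components] for a in components]
    lin = [dot(a, target) for a in components]
    trace = sum(gram[k][k] for k in range(len(components)))
    x = [0.0] * len(components)
    if trace == 0.0:
        return x
    # The trace bounds the largest eigenvalue of the Gram matrix,
    # so 1/trace is a conservative (safe) step size.
    step = 1.0 / trace
    for _ in range(iterations):
        grad = [dot(gram[k], x) - lin[k] for k in range(len(x))]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, xk - step * gk) for xk, gk in zip(x, grad)]
    return x


def section_weights(feature_ssms, sections, labels):
    """Return one weight per (feature name, section index) pair."""
    target = flatten(annotation_ssm(labels))
    keys, components = [], []
    for name, ssm in feature_ssms.items():
        for s, (start, end) in enumerate(sections):
            keys.append((name, s))
            components.append(flatten(row_mask(ssm, start, end)))
    return dict(zip(keys, solve_nonneg_qp(components, target)))


# Hypothetical toy data: 4 frames, two annotated sections (A, then B).
labels = ["A", "A", "B", "B"]
sections = [(0, 2), (2, 4)]  # half-open frame ranges
feature_ssms = {
    # Agrees with the annotation in section A only.
    "harmony": [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    # Agrees with the annotation in section B only.
    "timbre": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]],
}
weights = section_weights(feature_ssms, sections, labels)
for key in sorted(weights):
    print(key, round(weights[key], 3))
```

In this toy case the weights concentrate on `harmony` for the first section and on `timbre` for the second, which mirrors the abstract's idea that the feature a listener attends to can change from section to section. Real SSMs are far larger and noisier, so an established QP solver would be the more sensible choice outside of a sketch like this.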

Index Terms

1. Computing methodologies
  1. Modeling and simulation
    1. Model development and analysis
      1. Modeling methodologies

Published in
MM '13: Proceedings of the 21st ACM international conference on Multimedia
October 2013
1166 pages
ISBN:9781450324045
DOI:10.1145/2502081
General Chairs: Alejandro (Alex) Jaimes (Yahoo!, Spain); Nicu Sebe (University of Trento, Italy); Nozha Boujemaa (INRIA, France)
Program Chairs: Daniel Gatica-Perez (IDIAP & EPFL, Switzerland); David A. Shamma (Yahoo!, USA); Marcel Worring (University of Amsterdam, The Netherlands); Roger Zimmermann (National University of Singapore, Singapore)
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
music cognition
music perception
music structure analysis
quadratic programming
repetition
Qualifiers
- research-article
Acceptance Rates
MM '13 Paper Acceptance Rate: 47 of 235 submissions, 20%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%