DOI: 10.1145/2502081.2502124
research-article

Using quadratic programming to estimate feature relevance in structural analyses of music

Published: 21 October 2013

ABSTRACT

To identify repeated patterns and contrasting sections in music, it is common to use self-similarity matrices (SSMs) to visualize and estimate structure. We introduce a novel application for SSMs derived from audio recordings: using them to learn about the potential reasoning behind a listener's annotation. We use SSMs generated by musically-motivated audio features at various timescales to represent contributions to a structural annotation. Since a listener's attention can shift among musical features (e.g., rhythm, timbre, and harmony) throughout a piece, we further break down the SSMs into section-wise components and use quadratic programming (QP) to minimize the distance between a linear sum of these components and the annotated description. We posit that the optimal section-wise weights on the feature components may indicate the features to which a listener attended when annotating a piece, and thus may help us to understand why two listeners disagreed about a piece's structure. We discuss some examples that substantiate the claim that feature relevance varies throughout a piece, using our method to investigate differences between listeners' interpretations, and lastly propose some variations on our method.
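The fitting step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the cosine similarity, the row-or-column masking used for section-wise components, the non-negativity constraint on the weights, and the projected-gradient solver (standing in for a general QP solver) are all assumptions made for illustration, as are the function names and the toy data.

```python
import math

def cosine_ssm(frames):
    """Self-similarity matrix from one feature vector per beat (assumed metric)."""
    def cos(u, v):
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0
    return [[cos(u, v) for v in frames] for u in frames]

def annotation_ssm(labels):
    """Binary matrix: 1 where two beats carry the same section label."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def section_component(ssm, start, end):
    """Keep rows and columns in [start, end), zero elsewhere (assumed masking)."""
    n = len(ssm)
    return [[ssm[i][j] if (start <= i < end or start <= j < end) else 0.0
             for j in range(n)] for i in range(n)]

def solve_nonneg_weights(components, target, iters=2000):
    """Minimize ||sum_k x_k M_k - N||^2 subject to x >= 0 by projected gradient."""
    flat = [[v for row in m for v in row] for m in components]
    b = [v for row in target for v in row]
    k = len(flat)
    # Quadratic term (Gram matrix) and linear term of the QP.
    gram = [[sum(p * q for p, q in zip(flat[i], flat[j])) for j in range(k)]
            for i in range(k)]
    c = [sum(p * q for p, q in zip(flat[i], b)) for i in range(k)]
    # The trace bounds the largest eigenvalue of a PSD matrix, so this step is safe.
    step = 1.0 / max(sum(gram[i][i] for i in range(k)), 1e-12)
    x = [0.0] * k
    for _ in range(iters):
        # Gradient of half the squared error, then projection onto x >= 0.
        grad = [sum(gram[i][j] * x[j] for j in range(k)) - c[i] for i in range(k)]
        x = [max(0.0, xi - step * g) for xi, g in zip(x, grad)]
    return x

# Toy data (hypothetical): harmony follows the annotated A/B sections, timbre does not.
harmony = [[1, 0]] * 3 + [[0, 1]] * 3
timbre = [[1, 0], [0, 1]] * 3
labels = ["A", "A", "A", "B", "B", "B"]
sections = [(0, 3), (3, 6)]

names, components = [], []
for feature_name, frames in [("harmony", harmony), ("timbre", timbre)]:
    ssm = cosine_ssm(frames)
    for start, end in sections:
        names.append((feature_name, start, end))
        components.append(section_component(ssm, start, end))

weights = solve_nonneg_weights(components, annotation_ssm(labels))
for name, w in zip(names, weights):
    print(name, round(w, 3))
```

On this toy input the harmony components should receive weights close to 1 in both sections and the timbre components weights close to 0, which is the kind of section-wise relevance estimate the abstract describes. Real use would involve many more features, timescales, and beats, where a dedicated QP solver would likely be a better fit than this simple iteration.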


    • Published in

      MM '13: Proceedings of the 21st ACM international conference on Multimedia
      October 2013
      1166 pages
      ISBN:9781450324045
      DOI:10.1145/2502081

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      MM '13 Paper Acceptance Rate: 47 of 235 submissions, 20%
      Overall Acceptance Rate: 995 of 4,171 submissions, 24%
