DOI: 10.1145/1277741.1277817

Towards musical query-by-semantic-description using the CAL500 data set

Published: 23 July 2007

Abstract

Query-by-semantic-description (QBSD) is a natural paradigm for retrieving content from large databases of music. A major impediment to the development of good QBSD systems for music information retrieval has been the lack of a cleanly labeled, publicly available, heterogeneous data set of songs and associated annotations. We have collected the Computer Audition Lab 500-song (CAL500) data set by having humans listen to and annotate songs using a survey designed to capture 'semantic associations' between music and words. We adapt the supervised multi-class labeling (SML) model, which has shown good performance on the task of image retrieval, and use the CAL500 data to learn a model for music retrieval. The model parameters are estimated using the weighted mixture hierarchies expectation-maximization algorithm, which has been specifically designed to handle real-valued semantic associations between words and songs rather than binary class labels. The output of the SML model, a vector of class-conditional probabilities, can be interpreted as a semantic multinomial distribution over a vocabulary. By also representing a semantic query as a query multinomial distribution, we can quickly rank-order the songs in a database based on the Kullback-Leibler divergence between the query multinomial and each song's semantic multinomial. Qualitative and quantitative results demonstrate that our SML model can both annotate a novel song with meaningful words and retrieve relevant songs given a multi-word, text-based query.
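
The retrieval mechanism described above can be made concrete: each song is summarized by a semantic multinomial over the annotation vocabulary (the SML model's class-conditional probabilities), a text query is mapped to a query multinomial over the same vocabulary, and the database is ranked by the Kullback-Leibler divergence between the two distributions. The Python sketch below is a minimal illustration of that ranking step under stated assumptions, not the authors' implementation; the construction of the query multinomial (uniform mass on the query words plus light smoothing), the function names, and the toy vocabulary are all hypothetical.

import numpy as np

def query_multinomial(query_words, vocabulary, smoothing=1e-6):
    # Place uniform mass on the query words and a small smoothing mass on
    # every other word so the distribution has full support (an assumed
    # construction; the abstract does not specify the exact mapping).
    q = np.full(len(vocabulary), smoothing)
    for word in query_words:
        if word in vocabulary:
            q[vocabulary.index(word)] += 1.0
    return q / q.sum()

def kl_divergence(q, p, eps=1e-12):
    # KL(q || p) = sum_w q_w * log(q_w / p_w); eps guards against log(0).
    q = np.clip(q, eps, None)
    p = np.clip(p, eps, None)
    return float(np.sum(q * np.log(q / p)))

def rank_songs(query_words, vocabulary, song_multinomials):
    # Order song ids from most to least relevant, i.e. by increasing KL
    # divergence between the query multinomial and each song's
    # semantic multinomial.
    q = query_multinomial(query_words, vocabulary)
    scores = {song_id: kl_divergence(q, p)
              for song_id, p in song_multinomials.items()}
    return sorted(scores, key=scores.get)

# Toy example: a 4-word vocabulary and two songs with hypothetical
# semantic multinomials (the real CAL500 vocabulary is far larger).
vocab = ["mellow", "acoustic", "aggressive", "electronic"]
songs = {
    "song_a": np.array([0.45, 0.40, 0.05, 0.10]),
    "song_b": np.array([0.05, 0.10, 0.50, 0.35]),
}
print(rank_songs(["mellow", "acoustic"], vocab, songs))  # -> ['song_a', 'song_b']

Lower divergence means a closer match, so for the query "mellow acoustic" the song whose semantic multinomial concentrates on those words ranks first.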




      Information

      Published In

      SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
      July 2007
      946 pages
      ISBN: 9781595935977
      DOI: 10.1145/1277741
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 23 July 2007


      Author Tags

      1. content-based music information retrieval
      2. query-by-semantic-description
      3. supervised multi-class classification

      Qualifiers

      • Article

      Conference

      SIGIR07: The 30th Annual International SIGIR Conference
      July 23 - 27, 2007
      Amsterdam, The Netherlands

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%


      Cited By

      • (2024) MuIm: Analyzing Music–Image Correlations from an Artistic Perspective. Applied Sciences 14(23):11470. DOI: 10.3390/app142311470. Online publication date: 9-Dec-2024
      • (2023) Music Deep Learning: Deep Learning Methods for Music Signal Processing—A Review of the State-of-the-Art. IEEE Access 11:17031-17052. DOI: 10.1109/ACCESS.2023.3244620. Online publication date: 2023
      • (2023) Image Is All for Music Retrieval: Interactive Music Retrieval System Using Images with Mood and Theme Attributes. International Journal of Human–Computer Interaction 40(14):3841-3855. DOI: 10.1080/10447318.2023.2201557. Online publication date: 24-Apr-2023
      • (2022) MERP: A Music Dataset with Emotion Ratings and Raters’ Profile Information. Sensors 23(1):382. DOI: 10.3390/s23010382. Online publication date: 29-Dec-2022
      • (2022) Music Emotion Recognition Based on Bilayer Feature Extraction. Wireless Communications & Mobile Computing 2022. DOI: 10.1155/2022/7832548. Online publication date: 1-Jan-2022
      • (2022) Deep Learning Aided Emotion Recognition from Music. 2022 International Conference on Automation, Computing and Renewable Systems (ICACRS), 712-716. DOI: 10.1109/ICACRS55517.2022.10029108. Online publication date: 13-Dec-2022
      • (2022) On the application of deep learning and multifractal techniques to classify emotions and instruments using Indian Classical Music. Physica A: Statistical Mechanics and its Applications 597:127261. DOI: 10.1016/j.physa.2022.127261. Online publication date: Jul-2022
      • (2022) A survey of music emotion recognition. Frontiers of Computer Science 16(6). DOI: 10.1007/s11704-021-0569-4. Online publication date: 22-Jan-2022
      • (2021) Deep-Learning-Based Multimodal Emotion Classification for Music Videos. Sensors 21(14):4927. DOI: 10.3390/s21144927. Online publication date: 20-Jul-2021
      • (2020) Missing Value Imputation for Mixed Data via Gaussian Copula. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 636-646. DOI: 10.1145/3394486.3403106. Online publication date: 23-Aug-2020
