DOI: 10.1145/1277741.1277817

Towards musical query-by-semantic-description using the CAL500 data set

Published: 23 July 2007

Abstract

Query-by-semantic-description (QBSD) is a natural paradigm for retrieving content from large databases of music. A major impediment to the development of good QBSD systems for music information retrieval has been the lack of a cleanly labeled, publicly available, heterogeneous data set of songs and associated annotations. We have collected the Computer Audition Lab 500-song (CAL500) data set by having humans listen to and annotate songs using a survey designed to capture 'semantic associations' between music and words. We adapt the supervised multi-class labeling (SML) model, which has shown good performance on the task of image retrieval, and use the CAL500 data to learn a model for music retrieval. The model parameters are estimated using the weighted mixture hierarchies expectation-maximization algorithm, which has been specifically designed to handle real-valued semantic associations between words and songs rather than binary class labels. The output of the SML model, a vector of class-conditional probabilities, can be interpreted as a semantic multinomial distribution over a vocabulary. By also representing a semantic query as a query multinomial distribution, we can quickly rank-order the songs in a database based on the Kullback-Leibler divergence between the query multinomial and each song's semantic multinomial. Qualitative and quantitative results demonstrate that our SML model can both annotate a novel song with meaningful words and retrieve relevant songs given a multi-word, text-based query.
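
The retrieval mechanism described above can be made concrete: each song is summarized by a semantic multinomial over the annotation vocabulary (the SML model's class-conditional probabilities), a text query is mapped to a query multinomial over the same vocabulary, and the database is ranked by the Kullback-Leibler divergence between the two distributions. The Python sketch below is a minimal illustration of that ranking step under stated assumptions, not the authors' implementation; the construction of the query multinomial (uniform mass on the query words plus light smoothing), the function names, and the toy vocabulary are all hypothetical.

import numpy as np

def query_multinomial(query_words, vocabulary, smoothing=1e-6):
    # Place uniform mass on the query words and a small smoothing mass on
    # every other word so the distribution has full support (an assumed
    # construction; the abstract does not specify the exact mapping).
    q = np.full(len(vocabulary), smoothing)
    for word in query_words:
        if word in vocabulary:
            q[vocabulary.index(word)] += 1.0
    return q / q.sum()

def kl_divergence(q, p, eps=1e-12):
    # KL(q || p) = sum_w q_w * log(q_w / p_w); eps guards against log(0).
    q = np.clip(q, eps, None)
    p = np.clip(p, eps, None)
    return float(np.sum(q * np.log(q / p)))

def rank_songs(query_words, vocabulary, song_multinomials):
    # Order song ids from most to least relevant, i.e. by increasing KL
    # divergence between the query multinomial and each song's
    # semantic multinomial.
    q = query_multinomial(query_words, vocabulary)
    scores = {song_id: kl_divergence(q, p)
              for song_id, p in song_multinomials.items()}
    return sorted(scores, key=scores.get)

# Toy example: a 4-word vocabulary and two songs with hypothetical
# semantic multinomials (the real CAL500 vocabulary is far larger).
vocab = ["mellow", "acoustic", "aggressive", "electronic"]
songs = {
    "song_a": np.array([0.45, 0.40, 0.05, 0.10]),
    "song_b": np.array([0.05, 0.10, 0.50, 0.35]),
}
print(rank_songs(["mellow", "acoustic"], vocab, songs))  # -> ['song_a', 'song_b']

Lower divergence means a closer match, so for the query "mellow acoustic" the song whose semantic multinomial concentrates on those words ranks first.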




      Information

      Published In

      SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
      July 2007
      946 pages
      ISBN: 9781595935977
      DOI: 10.1145/1277741
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 23 July 2007


      Author Tags

      1. content-based music information retrieval
      2. query-by-semantic-description
      3. supervised multi-class classification

      Qualifiers

      • Article

      Conference

      SIGIR07: The 30th Annual International SIGIR Conference
      July 23 - 27, 2007
      Amsterdam, The Netherlands

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%


      Cited By

      • (2024) MuIm: Analyzing Music–Image Correlations from an Artistic Perspective. Applied Sciences 14(23):11470. DOI: 10.3390/app142311470. Online publication date: 9-Dec-2024
      • (2023) Music Deep Learning: Deep Learning Methods for Music Signal Processing—A Review of the State-of-the-Art. IEEE Access 11:17031-17052. DOI: 10.1109/ACCESS.2023.3244620. Online publication date: 2023
      • (2023) Image Is All for Music Retrieval: Interactive Music Retrieval System Using Images with Mood and Theme Attributes. International Journal of Human–Computer Interaction 40(14):3841-3855. DOI: 10.1080/10447318.2023.2201557. Online publication date: 24-Apr-2023
      • (2022) MERP: A Music Dataset with Emotion Ratings and Raters’ Profile Information. Sensors 23(1):382. DOI: 10.3390/s23010382. Online publication date: 29-Dec-2022
      • (2022) Music Emotion Recognition Based on Bilayer Feature Extraction. Wireless Communications & Mobile Computing 2022. DOI: 10.1155/2022/7832548. Online publication date: 1-Jan-2022
      • (2022) Deep Learning Aided Emotion Recognition from Music. 2022 International Conference on Automation, Computing and Renewable Systems (ICACRS), 712-716. DOI: 10.1109/ICACRS55517.2022.10029108. Online publication date: 13-Dec-2022
      • (2022) On the application of deep learning and multifractal techniques to classify emotions and instruments using Indian Classical Music. Physica A: Statistical Mechanics and its Applications 597:127261. DOI: 10.1016/j.physa.2022.127261. Online publication date: Jul-2022
      • (2022) A survey of music emotion recognition. Frontiers of Computer Science 16(6). DOI: 10.1007/s11704-021-0569-4. Online publication date: 22-Jan-2022
      • (2021) Deep-Learning-Based Multimodal Emotion Classification for Music Videos. Sensors 21(14):4927. DOI: 10.3390/s21144927. Online publication date: 20-Jul-2021
      • (2020) Missing Value Imputation for Mixed Data via Gaussian Copula. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 636-646. DOI: 10.1145/3394486.3403106. Online publication date: 23-Aug-2020
