research-article

SRbench--a benchmark for soundtrack recommendation systems

Authors:
Aleksandar Stupar, Saarland University, Saarbruecken, Germany
Sebastian Michel, Saarland University, Saarbruecken, Germany

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management, October 2013, Pages 2285–2290, https://doi.org/10.1145/2505515.2505658

Published: 27 October 2013

ABSTRACT

In this work, we propose a benchmark for evaluating the retrieval performance of soundtrack recommendation systems. Such systems aim to find songs to be played as background music for a given set of images. The benchmark is based on preference judgments: relevance is treated as a continuous ordinal variable, and judgments are collected for pairs of songs with respect to a query (i.e., a set of images). To capture a wide variety of songs and images, we draw on a large space of music genres, different emotions expressed through music, and various query-image themes. The benchmark consists of two types of relevance assessments: (i) judgments obtained from a user study, which serve as a "gold standard" for (ii) relevance judgments gathered through Amazon's Mechanical Turk. We report on the performance of two state-of-the-art soundtrack recommendation systems using the proposed benchmark.
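
To make the idea of pairwise preference judgments concrete, the following is a minimal sketch of how a system's ranking could be scored against such judgments. The abstract does not state which evaluation measure the benchmark uses, so the agreement fraction below is only an illustrative assumption, and the function name, song identifiers, and judgments are all hypothetical.

```python
from typing import List, Tuple

# One preference judgment for a fixed query (a set of images):
# the first song was judged a better fit than the second.
Preference = Tuple[str, str]  # (preferred_song, other_song)


def preference_agreement(ranking: List[str], preferences: List[Preference]) -> float:
    """Fraction of judged song pairs that the ranking orders like the assessors.

    Assumption: pairs involving a song absent from the ranking are skipped.
    """
    position = {song: rank for rank, song in enumerate(ranking)}
    agree = 0
    judged = 0
    for preferred, other in preferences:
        if preferred not in position or other not in position:
            continue
        judged += 1
        # Lower rank index means the system placed the song higher.
        if position[preferred] < position[other]:
            agree += 1
    return agree / judged if judged else 0.0


# Hypothetical judgments and system output for a single query.
preferences = [("song_a", "song_b"), ("song_a", "song_c"), ("song_c", "song_b")]
system_ranking = ["song_a", "song_b", "song_c"]
score = preference_agreement(system_ranking, preferences)
```

In this toy case the ranking agrees with two of the three judged pairs. The same function could, under the same assumptions, be applied once with user-study judgments and once with Mechanical Turk judgments to compare the two sets of assessments.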

Index Terms

1. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results

Published in
CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN:9781450322638
DOI:10.1145/2505515
General Chairs: Qi He (LinkedIn, USA), Arun Iyengar (IBM T.J. Watson Research Center, USA)
Program Chairs: Wolfgang Nejdl (L3S Research Center, Germany), Jian Pei (Simon Fraser University, Canada), Rajeev Rastogi (Amazon, India)
Copyright © 2013 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
benchmark
evaluation
soundtrack recommendation
Qualifiers
- research-article
Acceptance Rates
CIKM '13 Paper Acceptance Rate: 143 of 848 submissions, 17%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%