Exploring the music similarity space on the web

Published: 22 July 2011

Abstract

This article comprehensively addresses the problem of measuring similarity between music artists using text-based features extracted from Web pages. To this end, we present a thorough evaluation of different term-weighting strategies, normalization methods, aggregation functions, and similarity measurement techniques. In large-scale genre classification experiments carried out on real-world artist collections, we analyze several thousand combinations of settings and parameters that influence the similarity calculation process, and investigate how they affect the quality of the similarity estimates. Accurate similarity measures for music are vital for many applications, such as automated playlist generation, music recommender systems, music information systems, and intelligent user interfaces that provide access to music collections by means beyond text-based browsing. Therefore, by exhaustively analyzing the potential of text-based features derived from artist-related Web pages, this article constitutes an important contribution to context-based music information research.
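To make the pipeline named in the abstract concrete, the following is a minimal sketch of one possible configuration: each artist's Web pages are aggregated into a single text, terms are weighted with a logarithmic TF-IDF variant, and artists are compared by cosine similarity. This is an illustration only, not the configuration the article evaluates or recommends. The artist names, page texts, and function names (`tokenize`, `tfidf_vectors`, `cosine`) are hypothetical, and the weighting formula is just one of the many variants such a study could compare.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a real pipeline would also strip markup)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return a sparse {term: weight} TF-IDF vector for each artist.

    docs maps an artist name to the concatenated text of that artist's pages.
    Assumed weighting: (1 + ln tf) * ln(N / df), chosen here for illustration.
    """
    term_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # each artist counts once per term
    return {
        name: {
            term: (1.0 + math.log(tf)) * math.log(n_docs / doc_freq[term])
            for term, tf in counts.items()
        }
        for name, counts in term_counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data standing in for crawled, aggregated artist pages.
pages = {
    "Artist A": "heavy metal guitar riffs loud metal band",
    "Artist B": "metal band with loud guitar solos",
    "Artist C": "baroque violin concerto chamber orchestra",
}
vectors = tfidf_vectors(pages)
sim_ab = cosine(vectors["Artist A"], vectors["Artist B"])
sim_ac = cosine(vectors["Artist A"], vectors["Artist C"])
print(f"A-B: {sim_ab:.3f}  A-C: {sim_ac:.3f}")
```

In this toy example the two artists that share vocabulary receive a positive similarity, while the pair with no terms in common receives zero. Swapping the weighting formula, the normalization, or the similarity function changes the estimates, which is the kind of parameter space the abstract describes exploring.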

Published In

ACM Transactions on Information Systems, Volume 29, Issue 3
July 2011
134 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/1993036

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 July 2011
Accepted: 01 January 2011
Revised: 01 November 2010
Received: 01 May 2010
Published in TOIS Volume 29, Issue 3

Author Tags

  1. Music information retrieval
  2. Web content mining
  3. evaluation
  4. term space

Qualifiers

  • Research-article
  • Research
  • Refereed
