DOI: 10.1145/1873951.1874006
research-article

Large-scale music tag recommendation with explicit multiple attributes

Published: 25 October 2010

ABSTRACT

Social tagging can provide rich semantic information for large-scale retrieval in music discovery. Such collaborative intelligence, however, also generates a large number of tags that are unhelpful to discovery, some of which obscure critical information. To address these shortcomings, tag recommendation for more robust music discovery has become an emerging research topic. However, current methods do not consider the diversity of music attributes and often rely on simple heuristics, such as tag frequency, to filter out irrelevant tags. Music attributes encompass any number of perceived dimensions, for instance vocalness, genre, and instrumentation, and many of these are underrepresented by current tag recommenders. We propose a scheme for tag recommendation using Explicit Multiple Attributes based on tag semantic similarity and music content. In our approach, the attribute space is explicitly constrained at the outset to a set that minimizes semantic loss and tag noise while ensuring attribute diversity. Once the user uploads or browses a song, the system recommends a list of relevant tags for each attribute independently. To the best of our knowledge, this is the first method to consider Explicit Multiple Attributes for tag recommendation. Our system is designed for large-scale deployment, on the order of millions of objects. To process large-scale music data sets, we design parallel algorithms based on the MapReduce framework to analyze music content and social tags, train a model, and compute tag similarity. We evaluate our tag recommendation system on CAL-500 and on a large-scale data set ($N = 77,448$ songs) generated by crawling YouTube and Last.fm. Our results indicate that the proposed method is both effective at recommending attribute-diverse, relevant tags and efficient at scalable processing.
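The abstract gives no formulas or pseudocode, so the following is only a minimal sketch of the general idea of recommending tags within each attribute independently, using tag co-occurrence counted in a map/reduce style. It is not the authors' method: it ignores music content entirely, uses raw co-occurrence as a stand-in for tag semantic similarity, and runs in a single process instead of on a MapReduce cluster. The data (`SONG_TAGS`, `TAG_ATTRIBUTE`) and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: each song carries a set of social tags.
SONG_TAGS = {
    "song_a": {"rock", "guitar", "male vocals"},
    "song_b": {"rock", "guitar", "female vocals"},
    "song_c": {"jazz", "piano", "female vocals"},
    "song_d": {"rock", "piano", "male vocals"},
}

# Hypothetical fixed attribute space: every tag belongs to exactly one attribute.
TAG_ATTRIBUTE = {
    "rock": "genre", "jazz": "genre",
    "guitar": "instrumentation", "piano": "instrumentation",
    "male vocals": "vocalness", "female vocals": "vocalness",
}


def map_pairs(tags):
    """Map step: emit ((tag_i, tag_j), 1) for each unordered tag pair of one song."""
    for a, b in combinations(sorted(tags), 2):
        yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def cooccurrence(song_tags):
    """Count how often each pair of tags appears on the same song."""
    emitted = (kv for tags in song_tags.values() for kv in map_pairs(tags))
    return reduce_counts(emitted)


def recommend(seed_tags, song_tags, tag_attribute, k=1):
    """Rank candidate tags per attribute by summed co-occurrence with the seed tags.

    Assumption: co-occurrence is a crude proxy for tag similarity.
    """
    counts = cooccurrence(song_tags)
    scores = defaultdict(Counter)
    for (a, b), n in counts.items():
        # Each pair is stored once, so check both orientations.
        for seed, cand in ((a, b), (b, a)):
            if seed in seed_tags and cand not in seed_tags:
                scores[tag_attribute[cand]][cand] += n
    # Take the top-k tags within each attribute separately.
    return {attr: [t for t, _ in ranked.most_common(k)]
            for attr, ranked in scores.items()}


if __name__ == "__main__":
    print(recommend({"rock"}, SONG_TAGS, TAG_ATTRIBUTE, k=1))
```

On this toy data, seeding with `rock` yields `guitar` for instrumentation and `male vocals` for vocalness. Ranking inside each attribute separately is what keeps the output attribute-diverse: a single global ranking by frequency could return several tags from one attribute and none from another.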


Published in

MM '10: Proceedings of the 18th ACM international conference on Multimedia
October 2010
1836 pages
ISBN: 9781605589336
DOI: 10.1145/1873951

Copyright © 2010 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Overall acceptance rate: 995 of 4,171 submissions (24%)
