DOI: 10.1145/2911451.2914700
short-paper

An Unsupervised Approach to Anomaly Detection in Music Datasets

Published: 7 July 2016

ABSTRACT

This paper presents an unsupervised method for systematically identifying anomalies in music datasets. The model integrates categorical regression and robust estimation techniques to infer anomalous scores in music clips. When applied to a music genre recognition dataset, the new method is able to detect corrupted, distorted, or mislabeled audio samples based on commonly used features in music information retrieval. The evaluation results show that the algorithm outperforms other anomaly detection methods and is capable of finding problematic samples identified by human experts. The proposed method introduces a preliminary framework for anomaly detection in music data that can serve as a useful tool to improve data integrity in the future.
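The abstract does not give the details of the model, so the following is only a minimal sketch of the general idea of combining a categorical model's output with a robust estimate to score clips. It is not the paper's algorithm. It assumes that some categorical classifier has already produced per-genre probabilities for each clip, measures how surprising each clip's given label is, and rescales that surprise with the median and the median absolute deviation (MAD). The function names, the example probabilities, and the threshold of 3.0 are all hypothetical.

```python
import math
import statistics


def surprise(probs, label):
    """Negative log-probability the model assigns to the clip's given label."""
    return -math.log(max(probs[label], 1e-12))


def robust_scores(values):
    """Robust z-scores: distance from the median, scaled by the MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        scale = 1.0  # degenerate case: all values (nearly) identical
    return [(v - med) / scale for v in values]


def flag_anomalies(clips, threshold=3.0):
    """Return indices of clips whose robust surprise score exceeds the threshold."""
    scores = robust_scores([surprise(p, y) for p, y in clips])
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical classifier outputs: (genre probabilities, dataset label).
clips = [
    ({"blues": 0.80, "jazz": 0.15, "rock": 0.05}, "blues"),
    ({"blues": 0.10, "jazz": 0.85, "rock": 0.05}, "jazz"),
    ({"blues": 0.05, "jazz": 0.05, "rock": 0.90}, "rock"),
    ({"blues": 0.75, "jazz": 0.20, "rock": 0.05}, "blues"),
    ({"blues": 0.02, "jazz": 0.03, "rock": 0.95}, "jazz"),  # label disagrees with model
    ({"blues": 0.10, "jazz": 0.80, "rock": 0.10}, "jazz"),
]

flagged = flag_anomalies(clips)
print(flagged)
```

Using the median and MAD rather than the mean and standard deviation keeps a few extreme clips from inflating the scale and hiding themselves, which is the usual motivation for robust estimation in this setting.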


      Published in

        SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
        July 2016
        1296 pages
        ISBN: 9781450340694
        DOI: 10.1145/2911451

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher: Association for Computing Machinery, New York, NY, United States

        Acceptance Rates

        SIGIR '16 paper acceptance rate: 62 of 341 submissions (18%). Overall acceptance rate: 792 of 3,983 submissions (20%).
