Music and timbre segmentation by recursive constrained K-means clustering

  • Original Paper
  • Computational Statistics

Abstract

Clustering features generated from musical sound recordings has proved beneficial for subsequent classification tasks such as instrument recognition (Ligges and Krey in Comput Stat 26(2):279–291, 2011). We propose using order-constrained solutions in K-means clustering to stabilize the results and improve the interpretability of the clustering. With this method, the misclassification error in the aforementioned instrument recognition task can be reduced further. Using order-constrained K-means, the musical structure of a whole piece of popular music can be extracted automatically. Visualizing the distances between the feature vectors in a self-distance matrix allows an easy visual verification of the result. To estimate the number of clusters, we propose calculating the adjusted Rand indices of bootstrap samples of the data and basing the decision on the minimum of a robust version of the coefficient of variation. In addition to the average stability (measured by the adjusted Rand index), this approach takes the variation between the different bootstrap samples into account.
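
The order constraint and the self-distance matrix mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "order constrained" means every cluster must be a contiguous run of frames in time, so that K clusters correspond to K - 1 change points, and it finds the partition with minimal within-cluster sum of squares by dynamic programming. The function names (`order_constrained_kmeans`, `self_distance_matrix`) are hypothetical, and the recursive variant and the bootstrap-based choice of the number of clusters are not shown.

```python
import math
from typing import List, Sequence


def self_distance_matrix(X: Sequence[Sequence[float]]) -> List[List[float]]:
    """Euclidean distance between every pair of feature vectors (frames)."""
    return [[math.dist(a, b) for b in X] for a in X]


def _prefix_sums(X):
    """Prefix sums of x and ||x||^2, so a segment's SSE costs O(d) to evaluate."""
    n, d = len(X), len(X[0])
    s1 = [[0.0] * d for _ in range(n + 1)]
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(X):
        s1[i + 1] = [a + b for a, b in zip(s1[i], x)]
        s2[i + 1] = s2[i] + sum(v * v for v in x)
    return s1, s2


def _sse(s1, s2, i: int, j: int) -> float:
    """Sum of squared distances to the segment mean for frames i..j-1 (j > i)."""
    sq_norm_of_sum = sum((b - a) ** 2 for a, b in zip(s1[i], s1[j]))
    return (s2[j] - s2[i]) - sq_norm_of_sum / (j - i)


def order_constrained_kmeans(X: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the k-1 change points of the best contiguous k-partition.

    Assumption (not from the abstract): clusters are contiguous in time.
    Segment c covers the frames [cp[c-1], cp[c]) with cp[-1] = 0, cp[k-1] = n.
    """
    n = len(X)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of frames")
    s1, s2 = _prefix_sums(X)
    inf = float("inf")
    # best[c][j]: minimal SSE when the first j frames form c segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # i = start index of the c-th segment
                cand = best[c - 1][i] + _sse(s1, s2, i, j)
                if cand < best[c][j]:
                    best[c][j], back[c][j] = cand, i
    # Follow the back-pointers from the last frame to recover the boundaries
    change_points, j = [], n
    for c in range(k, 1, -1):
        j = back[c][j]
        change_points.append(j)
    return change_points[::-1]


if __name__ == "__main__":
    # Toy one-dimensional "feature" sequence with three obvious sections
    frames = [[0.0], [0.1], [0.0], [5.0], [5.1], [9.0], [9.2]]
    print(order_constrained_kmeans(frames, 3))
    print(self_distance_matrix(frames)[0])
```

Because the segments are contiguous, the dynamic program returns a globally optimal solution for a given K, at a cost that is cubic in the number of frames in this naive form. Plotting the self-distance matrix as an image and overlaying the change points is one way to check a segmentation visually, since homogeneous sections show up as low-distance blocks along the diagonal.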

References

  • Bischl B, Wornowizki M, Borg K (2009) The mlr package: machine learning in R. http://www.algorithm-forge.com/bischl/mlr/

  • Davis SB, Mermelstein P (1980) Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust Speech Signal Process ASSP 28(4)

  • Dolnicar S, Leisch F (2010) Evaluation of structure and reproducibility of cluster solutions using the bootstrap. Mark Lett 21:83–101

  • Ellis DPW (2005) PLP and RASTA (and MFCC, and inversion) in Matlab. Online web resource. http://www.ee.columbia.edu/dpwe/resources/matlab/rastamat/

  • Hartigan JA, Wong MA (1979) A K-means clustering algorithm. Appl Stat 28:100–108

  • Hoffmeister S (2009) Partitionierende Clusterverfahren unter Ordnungs-Nebenbedingungen. Diplomarbeit, Institut für Statistik, Ludwig-Maximilians-Universität München

  • Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193–218

  • Karatzoglou A, Smola A, Hornik K, Zeileis A (2004) kernlab–an S4 package for Kernel methods in R. J Stat Softw 11(9):1–20. http://www.jstatsoft.org/v11/i09/

  • Leisch F (2006) A toolbox for K-centroids cluster analysis. Comput Stat Data Anal 51(2):526–544

  • Liaw A, Wiener M (2002) Classification and regression by randomForest. R News 2(3):18–22. http://CRAN.R-project.org/doc/Rnews/

  • Ligges U, Krey S (2011) Feature clustering for instrument classification. Comput Stat 26(2):279–291

  • Lloyd SP (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28:128–137

  • Opolko F, Wapnick J (1987) McGill University master samples (CDs)

  • Paulus J, Müller M, Klapuri A (2010) Audio-based music structure analysis. In: Downie S, Veltkamp RC (eds) Proceedings of the 11th international society for music information retrieval conference, pp 625–636

  • R Development Core Team (2012) R: a language and environment for statistical computing. Vienna, Austria. http://www.R-project.org. ISBN: 3-900051-07-0

  • Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850

  • Slaney M (1998) Auditory toolbox: a MATLAB toolbox for auditory modeling work Version 2. Technical report 1998–010. http://rvl4.ecn.purdue.edu/~malcolm/interval/1998-010/

  • Steinley D, Brusco M (2011) Choosing the number of clusters in K-means clustering. Psychol Methods 16(3):285–297

  • Steinley D, Hubert L (2008) Order-constrained solutions in K-means clustering: even better than being globally optimal. Psychometrika 73(5):647–664

  • Traunmüller H (1990) Analytical expressions for the tonotopic sensory scale. J Acoust Soc Am 88:97–100

  • Walker JS (1996) Fast Fourier transforms. CRC Press, Boca Raton

Acknowledgments

The work of Sebastian Krey has been supported by the Deutsche Forschungsgemeinschaft, Graduiertenkolleg 1032.

Author information

Corresponding author

Correspondence to Sebastian Krey.

About this article

Cite this article

Krey, S., Ligges, U. & Leisch, F. Music and timbre segmentation by recursive constrained K-means clustering. Comput Stat 29, 37–50 (2014). https://doi.org/10.1007/s00180-012-0358-5

  • DOI: https://doi.org/10.1007/s00180-012-0358-5
