Music structure analysis using self-similarity matrix and two-stage categorization

Multimedia Tools and Applications

Abstract

Music tends to have a distinct structure consisting of repetition and variation of components such as verse and chorus. Understanding this structure and its patterns has become increasingly important for music information retrieval (MIR). Many methods for music segmentation and structure analysis have been proposed so far; however, each has its advantages and disadvantages. Given the significant variations in timbre, articulation, and tempo of music, the task remains challenging. In this paper, we propose a novel method for music segmentation and structure analysis. We first extract the timbre feature from the acoustic music signal and construct a self-similarity matrix that shows the similarities among the features within the music clip. Next, we determine the candidate boundaries for music segmentation by tracking the standard deviation in the matrix. Finally, we perform two-stage categorization: (i) categorization of the segments in a music clip on the basis of the timbre feature and (ii) categorization of segments in the same category on the basis of the successive chromagram features. In this way, each music clip is represented by a sequence of states, where each state represents a category defined by the two-stage categorization. We evaluate the performance of the proposed method through experiments.
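The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, how the standard deviation is tracked, or how segments are compared, so the cosine similarity, the sliding square window (`half_width`), the peak threshold, the greedy grouping, and the use of one summary vector per segment (rather than a sequence of chromagram frames) are all assumptions made for illustration. Every function name below is hypothetical.

```python
import math
from statistics import pstdev

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def self_similarity(frames):
    """Square matrix S with S[i][j] = cosine similarity of frames i and j."""
    return [[cosine(a, b) for b in frames] for a in frames]

def std_profile(ssm, half_width):
    """Assumed tracking rule: std of the square SSM block centred on each frame."""
    n = len(ssm)
    profile = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        block = [ssm[r][c] for r in range(lo, hi) for c in range(lo, hi)]
        profile.append(pstdev(block))
    return profile

def candidate_boundaries(profile, threshold):
    """Indices where the profile has a local maximum at or above the threshold."""
    return [i for i in range(1, len(profile) - 1)
            if profile[i] >= threshold
            and profile[i] > profile[i - 1]
            and profile[i] >= profile[i + 1]]

def greedy_groups(vectors, min_sim):
    """Assign each vector to the first group whose representative is similar enough."""
    reps, labels = [], []
    for v in vectors:
        for g, rep in enumerate(reps):
            if cosine(v, rep) >= min_sim:
                labels.append(g)
                break
        else:
            # No existing group matches, so this vector starts a new one.
            reps.append(v)
            labels.append(len(reps) - 1)
    return labels

def two_stage_labels(timbre_segs, chroma_segs, min_sim=0.9):
    """Stage 1 groups segments by timbre; stage 2 splits each group by chroma."""
    stage1 = greedy_groups(timbre_segs, min_sim)
    labels = [None] * len(stage1)
    for g in sorted(set(stage1)):
        idx = [i for i, s in enumerate(stage1) if s == g]
        stage2 = greedy_groups([chroma_segs[i] for i in idx], min_sim)
        for i, s in zip(idx, stage2):
            labels[i] = (g, s)
    return labels

# Toy clip: 20 frames of one "timbre" followed by 20 frames of another.
frames = [[1.0, 0.0]] * 20 + [[0.0, 1.0]] * 20
profile = std_profile(self_similarity(frames), half_width=5)
print(candidate_boundaries(profile, threshold=0.1))
```

Inside a homogeneous segment the windowed block of the matrix is nearly constant, so its standard deviation is close to zero; a window that straddles two dissimilar segments mixes high and low similarities, which is why a peak in the profile is treated here as a candidate boundary. Each resulting label is a pair (timbre category, chroma sub-category), so a clip becomes a sequence of such states.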

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2013R1A1A2012627) and the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the C-ITRC (Convergence Information Technology Research Center) support program (NIPA-2013-H0301-13-3006) supervised by the NIPA (National IT Industry Promotion Agency).

Author information

Corresponding author

Correspondence to Eenjun Hwang.

About this article

Cite this article

Jun, S., Rho, S. & Hwang, E. Music structure analysis using self-similarity matrix and two-stage categorization. Multimed Tools Appl 74, 287–302 (2015). https://doi.org/10.1007/s11042-013-1761-9

  • DOI: https://doi.org/10.1007/s11042-013-1761-9
