Probabilistic Segmentation of Musical Sequences Using Restricted Boltzmann Machines

Lattner, Stefan; Grachten, Maarten; Agres, Kat; Cancino Chacón, Carlos Eduardo

doi:10.1007/978-3-319-20603-5_33

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9110)

Included in the following conference series:

International Conference on Mathematics and Computation in Music

Abstract

A salient characteristic of human perception of music is that musical events are perceived as being grouped temporally into structural units such as phrases or motifs. Segmentation of musical sequences into structural units is a topic of ongoing research, both in cognitive psychology and music information retrieval. Computational models of music segmentation are typically based either on explicit knowledge of music theory or human perception, or on statistical and information-theoretic properties of musical data. The former, rule-based approach has been found to better account for (human-annotated) segment boundaries in music than probabilistic approaches [14], although the statistical model proposed in [14] performs almost as well as state-of-the-art rule-based approaches. In this paper, we propose a new probabilistic segmentation method, based on Restricted Boltzmann Machines (RBMs). By sampling, we determine a probability distribution over a subset of visible units in the model, conditioned on a configuration of the remaining visible units. We apply this approach to an n-gram representation of melodies, where the RBM generates the conditional probability of a note given its \(n-1\) predecessors. We use this quantity in combination with a threshold to determine the location of segment boundaries. A comparative evaluation shows that this model slightly improves segmentation performance over the model proposed in [14], and as such is closer to the state-of-the-art rule-based models.
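
The sampling and thresholding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything below is an assumption made for illustration: the function names, the one-hot pitch encoding, treating the last note's units as a single categorical (softmax) group, the random untrained weights, and the rule that a boundary is marked at notes whose conditional probability falls below the threshold (the abstract does not specify the exact rule).

```python
import numpy as np

def conditional_note_distribution(W, b_vis, b_hid, context, n_pitches,
                                  n_samples=2000, burn_in=100, seed=0):
    """Estimate P(last note | n-1 predecessors) by Gibbs sampling in an RBM
    whose context units are clamped (hypothetical helper, one-hot pitches).

    W has shape (n_visible, n_hidden), with n_visible = n * n_pitches.
    """
    rng = np.random.default_rng(seed)
    n_ctx = len(context) * n_pitches
    v = np.zeros(W.shape[0])
    # Clamp the n-1 context notes as one-hot blocks.
    for pos, pitch in enumerate(context):
        v[pos * n_pitches + pitch] = 1.0
    # Start the free (last-note) block at a random pitch.
    v[n_ctx + rng.integers(n_pitches)] = 1.0

    counts = np.zeros(n_pitches)
    for step in range(burn_in + n_samples):
        # Sample hidden units given all visible units.
        p_h = 1.0 / (1.0 + np.exp(-(v @ W + b_hid)))
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # Resample only the last-note block; context units stay clamped.
        act = W[n_ctx:] @ h + b_vis[n_ctx:]
        p = np.exp(act - act.max())
        p /= p.sum()
        pitch = rng.choice(n_pitches, p=p)
        v[n_ctx:] = 0.0
        v[n_ctx + pitch] = 1.0
        if step >= burn_in:
            counts[pitch] += 1
    return counts / counts.sum()

def segment_boundaries(note_probs, threshold):
    """Indices of notes whose conditional probability is below the threshold
    (assumed boundary rule, for illustration only)."""
    return np.flatnonzero(np.asarray(note_probs) < threshold).tolist()

# Toy usage with random, untrained weights (illustrative values only).
n_pitches, n, n_hidden = 5, 3, 8
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(n * n_pitches, n_hidden))
b_vis = np.zeros(n * n_pitches)
b_hid = np.zeros(n_hidden)
dist = conditional_note_distribution(W, b_vis, b_hid, context=[0, 2],
                                     n_pitches=n_pitches)
boundaries = segment_boundaries([0.5, 0.05, 0.4, 0.01], threshold=0.1)
```

In practice the weights would come from a trained model, and the probability of each observed note would be read off the estimated distribution before thresholding.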

Notes

1. See [18].

References

1. Agres, K., Abdallah, S., Pearce, M.: An information-theoretic account of musical expectation and memory
2. Bregman, A.S.: Auditory Scene Analysis. MIT Press, Cambridge (1990)
3. Brent, M.R.: An efficient, probabilistically sound algorithm for segmentation and word discovery. Mach. Learn. 34(1–3), 71–105 (1999)
4. Cambouropoulos, E.: The local boundary detection model (LBDM) and its application in the study of expressive timing. In: Proceedings of the International Computer Music Conference, San Francisco, pp. 17–22 (2001)
5. Frankland, B.W., Cohen, A.J.: Parsing of melody: quantification and testing of the local grouping rules of Lerdahl and Jackendoff’s A Generative Theory of Tonal Music. Music Percept. 21(4), 499–543 (2004)
6. Goh, H., Thome, N., Cord, M.: Biasing restricted Boltzmann machines to manipulate latent selectivity and sparsity. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning (2010)
7. Grachten, M., Krebs, F.: An assessment of learned score features for modeling expressive dynamics in music. IEEE Trans. Multimedia 16(5), 1211–1218 (2014). http://dx.doi.org/10.1109/TMM.2014.2311013
8. Hinton, G.E., Osindero, S., Teh, Y.: A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)
9. Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Comput. 14(8), 1771–1800 (2002)
10. Lerdahl, F., Jackendoff, R.: A Generative Theory of Tonal Music. MIT Press, Cambridge (1983)
11. Meyer, L.: Emotion and Meaning in Music. University of Chicago Press, Chicago (1956)
12. Narmour, E.: The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model. University of Chicago Press, Chicago (1990)
13. Pearce, M.T., Müllensiefen, D., Wiggins, G.: The role of expectation and probabilistic learning in auditory boundary perception: a model comparison. Perception 39(10), 1365–1391 (2010)
14. Pearce, M.T., Müllensiefen, D., Wiggins, G.A.: Melodic grouping in music information retrieval: new methods and applications. In: Raś, Z.W., Wieczorkowska, A.A. (eds.) Adv. in Music Inform. Retrieval. SCI, vol. 274, pp. 364–388. Springer, Heidelberg (2010)
15. Schaffrath, H.: The Essen folksong collection in Kern format. In: Huron, D. (ed.) Database Containing, Folksong Transcriptions in the Kern Format and A -page Research Guide Computer Database. Menlo Park, CA (1995)
16. Temperley, D.: The Cognition of Basic Musical Structure. MIT Press, Cambridge (2001)
17. Tenney, J., Polansky, L.: Temporal gestalt perception in music. J. Music Theor. 24(2), 205–241 (1980)
18. Tieleman, T.: Training restricted Boltzmann machines using approximations to the likelihood gradient. In: Proceedings of the 25th International Conference on Machine Learning, pp. 1064–1071. ACM, New York (2008)
19. Tieleman, T., Hinton, G.: Using fast weights to improve persistent contrastive divergence. In: Proceedings of the 26th International Conference on Machine Learning, pp. 1033–1040. ACM, New York (2009)
20. Wertheimer, M.: Laws of organization in perceptual forms. In: Ellis, W. (ed.) A Source Book of Gestalt Psychology, pp. 71–88. Harcourt, New York (1938)

Acknowledgements

The project Lrn2Cre8 acknowledges the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under FET grant number 610859. We thank Marcus Pearce for sharing the Essen data used in [14].

Author information

Authors and Affiliations

Austrian Research Institute for Artificial Intelligence, Vienna, Austria
Stefan Lattner, Maarten Grachten & Carlos Eduardo Cancino Chacón
Queen Mary, University of London, London, UK
Kat Agres

Corresponding author

Correspondence to Stefan Lattner.

Editor information

Editors and Affiliations

De Montfort University, Leicester, United Kingdom
Tom Collins
Aalborg University, Aalborg, Denmark
David Meredith
Utrecht University, Utrecht, The Netherlands
Anja Volk

About this paper

Cite this paper

Lattner, S., Grachten, M., Agres, K., Cancino Chacón, C.E. (2015). Probabilistic Segmentation of Musical Sequences Using Restricted Boltzmann Machines. In: Collins, T., Meredith, D., Volk, A. (eds) Mathematics and Computation in Music. MCM 2015. Lecture Notes in Computer Science, vol 9110. Springer, Cham. https://doi.org/10.1007/978-3-319-20603-5_33

DOI: https://doi.org/10.1007/978-3-319-20603-5_33
Published: 16 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20602-8
Online ISBN: 978-3-319-20603-5
eBook Packages: Computer Science (R0)