Abstract
We introduce the music information retrieval (MIR) task of segmenting melodies into phrases, summarise the musicological and psychological background to the task, and review existing computational methods. We then present a new model, IDyOM, for melodic segmentation based on statistical learning and information-dynamic analysis. The performance of the model is compared with that of several existing algorithms in predicting the annotated phrase boundaries in a large corpus of folk music. The results indicate that four algorithms produce acceptable results. One of these is the IDyOM model, which performs much better than naive statistical models and approaches the performance of the best-performing rule-based models. A further slight improvement can be obtained by combining the output of the four algorithms in a hybrid model. Even so, the performance of the hybrid model is moderate at best, leaving a great deal of room for improvement on this task.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Pearce, M.T., Müllensiefen, D., Wiggins, G.A. (2010). Melodic Grouping in Music Information Retrieval: New Methods and Applications. In: Raś, Z.W., Wieczorkowska, A.A. (eds) Advances in Music Information Retrieval. Studies in Computational Intelligence, vol 274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11674-2_16
DOI: https://doi.org/10.1007/978-3-642-11674-2_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11673-5
Online ISBN: 978-3-642-11674-2
eBook Packages: Engineering, Engineering (R0)