Abstract
Is there a general mechanism that governs the perception of phrase structure in music and language? While it is usually assumed that humans have separate faculties for music and language, this work focuses on the commonalities between these modalities rather than on the differences, with the aim of finding a deeper “faculty”. We present a series of data-oriented parsing (DOP) models that aim to balance the simplest structure of an input with its most likely structure. Experiments with the Essen Folksong Collection and the Penn Treebank show that exactly the same model, with the same parameter setting, achieves maximum parse accuracy for both music and language. This suggests an interesting parallel between musical and linguistic processing. We also show that our results outperform both the melodic component of Temperley (2001) and the musical parser of Bod (2001b).
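One possible reading of "balancing the simplest structure with the most likely structure" is a two-stage selection: keep the n most probable candidate parses, then pick the one built in the fewest steps. The abstract does not spell out the mechanism, so the Python sketch below is only an illustration under that assumption. The `Candidate` class, its fields, the function name, and the toy numbers are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate parse; every field is an illustrative assumption."""
    label: str
    probability: float      # assumed model probability of this parse
    derivation_steps: int   # assumed number of fragments combined to build it


def simplest_of_most_likely(candidates, n):
    """Keep the n most probable candidates, then return the one with the fewest steps.

    Ties on step count are broken in favour of the more probable candidate.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    top_n = sorted(candidates, key=lambda c: c.probability, reverse=True)[:n]
    return min(top_n, key=lambda c: (c.derivation_steps, -c.probability))


# Toy candidates (made-up numbers, for illustration only).
candidates = [
    Candidate("A", 0.50, 5),
    Candidate("B", 0.30, 2),
    Candidate("C", 0.15, 1),
    Candidate("D", 0.05, 4),
]
```

In this sketch, n acts as the single shared parameter: with n = 1 the choice is purely likelihood-based, and as n grows to cover all candidates the choice becomes purely simplicity-based. Intermediate values trade one criterion against the other.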
References
Black, E. et al. (1991). A Procedure for Quantitatively Comparing the Syntactic Coverage of English, Proceedings DARPA Speech and Natural Language Workshop, Morgan Kaufmann.
Bod, R. (1993). Using an Annotated Language Corpus as a Virtual Stochastic Grammar, Proceedings AAAI-93, Menlo Park, Ca.
Bod, R. (1998). Beyond Grammar: An Experience-Based Theory of Language, Stanford: CSLI Publications (Lecture notes number 88), distributed by Cambridge University Press.
Bod, R. (2000). Parsing with the Shortest Derivation. Proceedings COLING-2000, Germany.
Bod, R. (2001a). What is the Minimal Set of Fragments that Achieves Maximal Parse Accuracy? Proceedings ACL’2001, Toulouse, France.
Bod, R. (2001b). A Memory-Based Model for Music Analysis. Proceedings International Computer Music Conference (ICMC’2001), Havana, Cuba.
Bod, R. (2001c). Memory-Based Models of Melodic Analysis: Challenging the Gestalt Principles. Journal of New Music Research, 31(1), in press. (available at http://turing.wins.uva.nl/~rens/jnmr01.pdf)
Bod, R. (2002). Combining Simplicity and Likelihood in Language and Music. Proceedings CogSci’2002. Fairfax, Virginia.
Bod, R., J. Hay and S. Jannedy (eds.) (2002a). Probabilistic Linguistics. Cambridge, The MIT Press. (in press)
Bod, R., R. Scha and K. Sima’an (eds.) (2002b). Data-Oriented Parsing. Stanford, CSLI Publications. (in press)
Buffart, H., E. Leeuwenberg and F. Restle (1983). Analysis of Ambiguity in Visual Pattern Completion. Journal of Experimental Psychology 9, 980–1000.
Charniak, E. (1997). Statistical Techniques for Natural Language Parsing, AI Magazine, Winter 1997, 32–43.
Charniak, E. (2000). A Maximum-Entropy-Inspired Parser. Proceedings ANLP-NAACL’2000.
Chater, N. (1999). The Search for Simplicity: A Fundamental Cognitive Principle? The Quarterly Journal of Experimental Psychology, 52A(2), 273–302.
Chomsky, N. (1965). Aspects of the Theory of Syntax, Cambridge, The MIT Press.
Collard, R., P. Vos and E. Leeuwenberg (1981). What Melody Tells about Metre in Music. Zeitschrift für Psychologie. 189, 25–33.
Collins, M. (2000). Discriminative Reranking for Natural Language Parsing, Proceedings ICML-2000, Stanford, Ca.
Collins, M. and N. Duffy (2002). New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. Proceedings ACL’2002, Philadelphia.
Frazier, L. (1978). On Comprehending Sentences: Syntactic Parsing Strategies. PhD. Thesis, University of Connecticut.
Goodman, J. (2002). Efficient Parsing of DOP with PCFG-Reductions. In R. Bod et al. 2002b.
von Helmholtz, H. (1910). Treatise on Physiological Optics (Vol. 3), Dover, New York.
Jurafsky, D. (2002). Probabilistic Modeling in Psycholinguistics: Comprehension and Production. In R. Bod et al. 2002a.
Kersten, D. (1999). High-level vision as statistical inference. In S. Gazzaniga (ed.), The New Cognitive Neurosciences, Cambridge, The MIT Press.
Leeuwenberg, E. (1971). A Perceptual Coding Language for Visual and Auditory Patterns. American Journal of Psychology. 84, 307–349.
Lerdahl, F. and R. Jackendoff (1983). A Generative Theory of Tonal Music. The MIT Press.
Longuet-Higgins, H. (1976). Perception of Melodies. Nature 263, October 21, 646–653.
Longuet-Higgins, H. and C. Lee, (1987). The Rhythmic Interpretation of Monophonic Music. In: Mental Processes: Studies in Cognitive Science, Cambridge, The MIT Press.
Manning, C. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, The MIT Press.
Marcus, M., B. Santorini and M. Marcinkiewicz (1993). Building a Large Annotated Corpus of English: the Penn Treebank, Computational Linguistics 19(2).
Marr, D. (1982). Vision. San Francisco, Freeman.
Mumford, D. (1999). The dawning of the age of stochasticity. Based on a lecture at the Accademia Nazionale dei Lincei. (available at http://www.dam.brown.edu/people/mumford/Papers/Dawning.ps)
Osborne, M. (1999). Minimal description length-based induction of definite clause grammars for noun phrase identification. Proceedings EACL Workshop on Computational Natural Language Learning. Bergen, Norway.
Palmer, S. (1977). Hierarchical Structure in Perceptual Representation. Cognitive Psychology, 9, 441–474.
Raphael, C. (1999). Automatic Segmentation of Acoustic Musical Signals Using Hidden Markov Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21, 360–370.
Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry. Series in Computer Science—Volume 15. World Scientific, 1989.
Saffran, J., M. Loman and R. Robertson (2000). Infant Memory for Musical Experiences. Cognition 77, B 16–23.
Schaffrath, H. (1995). The Essen Folksong Collection in the Humdrum Kern Format. D. Huron (ed.). Menlo Park, CA: Center for Computer Assisted Research in the Humanities.
Shannon, C. (1948). A Mathematical Theory of Communication. Bell System Technical Journal. 27, 379–423, 623–656.
Simon, H. (1972). Complexity and the Representation of Patterned Sequences of Symbols. Psychological Review. 79, 369–382.
Temperley, D. (2001). The Cognition of Basic Musical Structures. Cambridge, The MIT Press.
Wertheimer, M. (1923). Untersuchungen zur Lehre von der Gestalt. Psychologische Forschung 4, 301–350.
Wundt, W. (1901). Sprachgeschichte und Sprachpsychologie. Engelmann, Leipzig.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bod, R. (2002). A General Parsing Model for Music and Language. In: Anagnostopoulou, C., Ferrand, M., Smaill, A. (eds) Music and Artificial Intelligence. ICMAI 2002. Lecture Notes in Computer Science, vol 2445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45722-4_3
DOI: https://doi.org/10.1007/3-540-45722-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44145-8
Online ISBN: 978-3-540-45722-0
eBook Packages: Springer Book Archive