Abstract
Voice separation is the process of assigning notes to musical voices. A fundamental question when applying machine learning to this task is the architecture of the learning model. Most existing approaches make decisions in note-to-note steps (N2N) and use heuristics to resolve conflicts that arise in the process. We present a new approach that processes music in chord-to-chord steps (C2C), computing a solution for a complete chord at once. The C2C approach has the advantage of being cognitively more plausible, but it leads to feature-modelling problems, whereas the N2N approach is computationally more efficient. We evaluate a new C2C model against an N2N model using all 19 four-voice fugues from J. S. Bach’s Well-Tempered Clavier. The overall accuracy of the C2C model turned out slightly higher, but the difference was not statistically significant in our experiment. From a musical as well as a perceptual and cognitive perspective, this result indicates that feature design exploiting the additional information available in the C2C approach is a worthwhile topic for further research.
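The N2N/C2C distinction can be illustrated with a deliberately simplified sketch. The chapter's actual models are learned; the toy functions below are hypothetical stand-ins that use plain pitch distance as the only cost, purely to show why a greedy note-by-note decision can produce a conflict that a whole-chord decision avoids.

```python
from itertools import permutations

def n2n_assign(prev_pitches, chord):
    """Greedy note-to-note (N2N-style): assign each note, in input
    order, to the nearest still-free voice. Earlier decisions are
    never revisited, so later notes may be forced into bad voices."""
    free = set(range(len(prev_pitches)))
    assignment = []
    for note in chord:
        # sorted() makes the tie-break deterministic (lowest voice index)
        v = min(sorted(free), key=lambda i: abs(note - prev_pitches[i]))
        free.remove(v)
        assignment.append(v)
    return assignment

def c2c_assign(prev_pitches, chord):
    """Chord-to-chord (C2C-style): evaluate every one-to-one mapping
    of the chord's notes to voices and keep the jointly cheapest one."""
    best = min(permutations(range(len(prev_pitches)), len(chord)),
               key=lambda perm: sum(abs(n - prev_pitches[v])
                                    for n, v in zip(chord, perm)))
    return list(best)

# Previous chord: upper voice at MIDI 60, lower voice at 50.
# New chord [55, 61]: note 55 is equidistant from both voices, so the
# greedy N2N pass grabs the upper voice for it and leaves note 61 an
# 11-semitone leap in the lower voice; the C2C pass swaps them.
print(n2n_assign([60, 50], [55, 61]))  # [0, 1]  (total leap 16)
print(c2c_assign([60, 50], [55, 61]))  # [1, 0]  (total leap 6)
```

The second example is the crux: only by scoring the chord as a whole can the model trade a locally attractive assignment for a globally cheaper one, which is exactly the information an N2N model cannot use without conflict-resolution heuristics.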
© 2016 Springer International Publishing Switzerland
Cite this chapter
Weyde, T., de Valk, R. (2016). Chord- and Note-Based Approaches to Voice Separation. In: Meredith, D. (eds) Computational Music Analysis. Springer, Cham. https://doi.org/10.1007/978-3-319-25931-4_6