
Chord- and Note-Based Approaches to Voice Separation

  • Chapter

Abstract

Voice separation is the process of assigning notes to musical voices. A fundamental question when applying machine learning to this task is the architecture of the learning model. Most existing approaches make decisions in note-to-note steps (N2N) and use heuristics to resolve conflicts that arise in the process. We present a new approach that processes music in chord-to-chord steps (C2C), computing a solution for a complete chord at once. The C2C approach has the advantage of being cognitively more plausible, but it raises feature-modelling problems, while the N2N approach is computationally more efficient. We evaluate a new C2C model against an N2N model on all 19 four-voice fugues from J. S. Bach’s Well-Tempered Clavier. The overall accuracy of the C2C model was slightly higher, but the difference was not statistically significant in our experiment. From a musical as well as a perceptual and cognitive perspective, this result indicates that feature design that exploits the additional information available in the C2C approach is a worthwhile topic for further research.
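The contrast between the two decision granularities can be made concrete with a toy sketch. This is not the authors' models: the pitch representation, the per-voice "last pitch" state, and the pitch-distance cost are all illustrative assumptions. The N2N sketch commits to one note at a time, greedily; the C2C sketch scores every complete chord-to-voice mapping and keeps the cheapest, which is the kind of whole-chord decision the C2C approach makes available.

```python
from itertools import permutations

def n2n_assign(chord, voice_state):
    """Note-to-note sketch: assign each note, in order, greedily to the
    still-free voice whose last pitch is closest. Earlier decisions are
    never revised, so conflicts must be resolved as they arise."""
    free = set(range(len(voice_state)))
    assignment = {}
    for i, pitch in enumerate(chord):
        v = min(free, key=lambda v: abs(pitch - voice_state[v]))
        assignment[i] = v
        free.remove(v)
    return assignment

def c2c_assign(chord, voice_state):
    """Chord-to-chord sketch: score every complete mapping of the chord's
    notes onto distinct voices and return the globally cheapest one."""
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(voice_state)), len(chord)):
        cost = sum(abs(p - voice_state[v]) for p, v in zip(chord, perm))
        if cost < best_cost:
            best, best_cost = dict(enumerate(perm)), cost
    return best

# Illustrative case: two voices whose last pitches were 60 and 65,
# and an incoming two-note chord (63, 64).
state = [60, 65]
chord = [63, 64]
print(n2n_assign(chord, state))  # greedy: {0: 1, 1: 0}, total cost 6
print(c2c_assign(chord, state))  # global: {0: 0, 1: 1}, total cost 4
```

In this example the greedy N2N pass commits the first note (63) to the nearer voice (65) and is then forced to give the second note a distant voice, for a total pitch distance of 6; the C2C search over complete mappings finds the assignment with total distance 4. The point of the sketch is only the decision granularity, not the cost function, which in a real model would be learned from features.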




Author information


Corresponding author

Correspondence to Tillman Weyde.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Weyde, T., de Valk, R. (2016). Chord- and Note-Based Approaches to Voice Separation. In: Meredith, D. (eds) Computational Music Analysis. Springer, Cham. https://doi.org/10.1007/978-3-319-25931-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25929-1

  • Online ISBN: 978-3-319-25931-4

  • eBook Packages: Computer Science (R0)
