
Chord- and Note-Based Approaches to Voice Separation

  • Chapter

Abstract

Voice separation is the process of assigning notes to musical voices. A fundamental question when applying machine learning to this task is the architecture of the learning model. Most existing approaches make decisions in note-to-note steps (N2N) and use heuristics to resolve conflicts that arise in the process. We present a new approach that processes music in chord-to-chord steps (C2C), computing a solution for a complete chord at once. The C2C approach has the advantage of being cognitively more plausible, but it raises feature-modelling problems, while the N2N approach is computationally more efficient. We evaluate a new C2C model against an N2N model on all 19 four-voice fugues from J. S. Bach’s Well-Tempered Clavier. The overall accuracy of the C2C model was slightly higher, but the difference was not statistically significant in our experiment. From a musical as well as a perceptual and cognitive perspective, this result indicates that feature design that exploits the additional information available in the C2C approach is a worthwhile topic for further research.
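The contrast between the two decision granularities can be made concrete with a toy sketch. This is not the authors' models: the pitch representation, the per-voice "last pitch" state, and the pitch-distance cost are all illustrative assumptions. The N2N sketch commits to one note at a time, greedily; the C2C sketch scores every complete chord-to-voice mapping and keeps the cheapest, which is the kind of whole-chord decision the C2C approach makes available.

```python
from itertools import permutations

def n2n_assign(chord, voice_state):
    """Note-to-note sketch: assign each note, in order, greedily to the
    still-free voice whose last pitch is closest. Earlier decisions are
    never revised, so conflicts must be resolved as they arise."""
    free = set(range(len(voice_state)))
    assignment = {}
    for i, pitch in enumerate(chord):
        v = min(free, key=lambda v: abs(pitch - voice_state[v]))
        assignment[i] = v
        free.remove(v)
    return assignment

def c2c_assign(chord, voice_state):
    """Chord-to-chord sketch: score every complete mapping of the chord's
    notes onto distinct voices and return the globally cheapest one."""
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(voice_state)), len(chord)):
        cost = sum(abs(p - voice_state[v]) for p, v in zip(chord, perm))
        if cost < best_cost:
            best, best_cost = dict(enumerate(perm)), cost
    return best

# Illustrative case: two voices whose last pitches were 60 and 65,
# and an incoming two-note chord (63, 64).
state = [60, 65]
chord = [63, 64]
print(n2n_assign(chord, state))  # greedy: {0: 1, 1: 0}, total cost 6
print(c2c_assign(chord, state))  # global: {0: 0, 1: 1}, total cost 4
```

In this example the greedy N2N pass commits the first note (63) to the nearer voice (65) and is then forced to give the second note a distant voice, for a total pitch distance of 6; the C2C search over complete mappings finds the assignment with total distance 4. The point of the sketch is only the decision granularity, not the cost function, which in a real model would be learned from features.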




Author information


Corresponding author

Correspondence to Tillman Weyde.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Weyde, T., de Valk, R. (2016). Chord- and Note-Based Approaches to Voice Separation. In: Meredith, D. (eds) Computational Music Analysis. Springer, Cham. https://doi.org/10.1007/978-3-319-25931-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25929-1

  • Online ISBN: 978-3-319-25931-4

  • eBook Packages: Computer Science (R0)
