Comparing Models for Harmony Prediction in an Interactive Audio Looper

  • Conference paper
Computational Intelligence in Music, Sound, Art and Design (EvoMUSART 2019)

Abstract

Musicians often use tools such as loop pedals and multitrack recorders to assist in improvisation and songwriting, but these tools generally don’t proactively contribute aspects of the musical performance. In this work, we introduce an interactive audio looper that predicts a loop’s harmony and constructs an accompaniment automatically using concatenative synthesis. The system uses a machine learning (ML) model for harmony prediction; that is, it generates a sequence of chord symbols for a given melody. We analyse the performance of two potential ML models for this task: a hidden Markov model (HMM) and a recurrent neural network (RNN) with bidirectional long short-term memory (BLSTM) cells. Our findings show that the RNN approach provides more accurate predictions and is more robust with respect to changes in the training data. We consider the impact of each model’s predictions in live performance and ask: “What is an accurate chord prediction anyway?”
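
To make the harmony-prediction task concrete, the following is a minimal sketch of how an HMM could map a melody to a chord sequence using Viterbi decoding [8]. It is an illustration only, not the authors' implementation: the two-chord vocabulary, the note alphabet, and every probability in `START`, `TRANS`, and `EMIT` are hypothetical values chosen for the example, and the function name `viterbi` is likewise an assumption.

```python
import math

# Hypothetical toy model (all values are illustrative assumptions).
# Hidden states are chords; observations are melody note names.
CHORDS = ["C", "G"]
START = {"C": 0.6, "G": 0.4}
TRANS = {"C": {"C": 0.7, "G": 0.3}, "G": {"C": 0.4, "G": 0.6}}
EMIT = {
    "C": {"C": 0.40, "E": 0.30, "G": 0.20, "B": 0.05, "D": 0.05},
    "G": {"C": 0.05, "E": 0.05, "G": 0.30, "B": 0.30, "D": 0.30},
}


def viterbi(melody):
    """Return the most likely chord for each note in a non-empty melody."""
    # Work in log space to avoid underflow on long sequences.
    score = {c: math.log(START[c]) + math.log(EMIT[c][melody[0]]) for c in CHORDS}
    back = []  # one dict of back-pointers per step after the first
    for note in melody[1:]:
        new_score, pointers = {}, {}
        for c in CHORDS:
            # Best previous chord from which to transition into chord c.
            prev = max(CHORDS, key=lambda p: score[p] + math.log(TRANS[p][c]))
            pointers[c] = prev
            new_score[c] = (
                score[prev] + math.log(TRANS[prev][c]) + math.log(EMIT[c][note])
            )
        back.append(pointers)
        score = new_score
    # Trace the best path backwards from the best final chord.
    path = [max(CHORDS, key=lambda c: score[c])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


print(viterbi(["C", "E", "B", "D"]))
```

With these toy probabilities, the melody C, E, B, D is decoded as two steps of C followed by two steps of G. A BLSTM-based predictor would replace the fixed transition and emission tables with a learned network that reads the melody in both directions.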

Notes

  1. This research is an extension of the author’s master’s thesis [24].

  2. Performance with the PSCA Looper (direct download): https://goo.gl/59kVko.

References

  1. Bahl, L.R., Jelinek, F., Mercer, R.L.: A maximum likelihood approach to continuous speech recognition. In: Readings in Speech Recognition, pp. 308–319. Elsevier (1990)

  2. Brunner, G., Wang, Y., Wattenhofer, R., Wiesendanger, J.: Jambot: music theory aware chord based generation of polyphonic music with LSTMs. In: 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 519–526. IEEE (2017). https://doi.org/10.1109/ICTAI.2017.00085

  3. Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., Efros, A.A.: Large-scale study of curiosity-driven learning. In: Proceedings of the International Conference on Learning Representations (ICLR) (2019). https://arxiv.org/abs/1808.04355

  4. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measur. 20(1), 37–46 (1960). https://doi.org/10.1177/001316446002000104

  5. Cuthbert, M.S., Ariza, C.: music21: a toolkit for computer-aided musicology and symbolic music data. In: Downie, J.S., Veltkamp, R.C. (eds.) Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pp. 637–642. International Society for Music Information Retrieval, Utrecht (2010)

  6. Eck, D., Schmidhuber, J.: Finding temporal structure in music: blues improvisation with LSTM recurrent networks. In: Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, pp. 747–756. IEEE (2002). https://doi.org/10.1109/NNSP.2002.1030094

  7. Fan, W., Stolfo, S.J., Zhang, J., Chan, P.K.: AdaCost: misclassification cost-sensitive boosting. In: Proceedings of the Sixteenth International Conference on Machine Learning, ICML 1999, vol. 99, pp. 97–105 (1999)

  8. Forney, G.D.: The Viterbi algorithm. Proc. IEEE 61(3), 268–278 (1973)

  9. Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 18(5–6), 602–610 (2005)

  10. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  11. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  12. Lehman, J., Stanley, K.O.: Abandoning objectives: evolution through the search for novelty alone. Evol. Comput. 19(2), 189–223 (2011)

  13. Lim, H., Rhyu, S., Lee, K.: Chord generation from symbolic melody using BLSTM networks. In: 18th International Society for Music Information Retrieval Conference (2017)

  14. Martin, C.P., Ellefsen, K.O., Torresen, J.: Deep predictive models in interactive music. arXiv e-prints, January 2018. https://arxiv.org/abs/1801.10492

  15. Martin, C.P., Torresen, J.: RoboJam: a musical mixture density network for collaborative touchscreen interaction. In: Liapis, A., Romero Cardalda, J.J., Ekárt, A. (eds.) EvoMUSART 2018. LNCS, vol. 10783, pp. 161–176. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-77583-8_11

  16. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  17. Pachet, F., Roy, P., Moreira, J., d’Inverno, M.: Reflexive loopers for solo musical improvisation. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2013, pp. 2205–2208. ACM, New York (2013). https://doi.org/10.1145/2470654.2481303

  18. Rabiner, L., Juang, B.: An introduction to hidden Markov models. IEEE ASSP Mag. 3(1), 4–16 (1986). https://doi.org/10.1109/MASSP.1986.1165342

  19. Raczyński, S.A., Fukayama, S., Vincent, E.: Melody harmonization with interpolated probabilistic models. J. New Music Res. 42(3), 223–235 (2013)

  20. Schmidhuber, J.: Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Connection Sci. 18(2), 173–187 (2006)

  21. Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Trans. Sig. Process. 45(11), 2673–2681 (1997)

  22. Simon, I., Morris, D., Basu, S.: MySong: automatic accompaniment generation for vocal melodies. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2008, pp. 725–734. ACM, New York (2008). https://doi.org/10.1145/1357054.1357169

  23. Tokuda, K., Zen, H., Black, A.W.: An HMM-based speech synthesis system applied to English. In: IEEE Speech Synthesis Workshop, pp. 227–230 (2002)

  24. Wallace, B.: Predictive songwriting with concatenative accompaniment. Master’s thesis, Department of Informatics, University of Oslo (2018)

Acknowledgment

This work was supported by The Research Council of Norway as a part of the Engineering Predictability with Embodied Cognition (EPEC) project, under grant agreement 240862 and through its Centres of Excellence scheme, project number 262762.

Author information

Corresponding author

Correspondence to Benedikte Wallace.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wallace, B., Martin, C.P. (2019). Comparing Models for Harmony Prediction in an Interactive Audio Looper. In: Ekárt, A., Liapis, A., Castro Pena, M.L. (eds) Computational Intelligence in Music, Sound, Art and Design. EvoMUSART 2019. Lecture Notes in Computer Science, vol 11453. Springer, Cham. https://doi.org/10.1007/978-3-030-16667-0_12

  • DOI: https://doi.org/10.1007/978-3-030-16667-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16666-3

  • Online ISBN: 978-3-030-16667-0

  • eBook Packages: Computer Science, Computer Science (R0)
