Comparing Models for Harmony Prediction in an Interactive Audio Looper

Wallace, Benedikte; Martin, Charles P.

doi:10.1007/978-3-030-16667-0_12

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11453)

Included in the following conference series:

International Conference on Computational Intelligence in Music, Sound, Art and Design (Part of EvoStar)

Abstract

Musicians often use tools such as loop pedals and multitrack recorders to assist in improvisation and songwriting, but these tools generally don’t proactively contribute aspects of the musical performance. In this work, we introduce an interactive audio looper that predicts a loop’s harmony and constructs an accompaniment automatically using concatenative synthesis. The system uses a machine learning (ML) model for harmony prediction; that is, it generates a sequence of chord symbols for a given melody. We analyse the performance of two potential ML models for this task: a hidden Markov model (HMM) and a recurrent neural network (RNN) with bidirectional long short-term memory (BLSTM) cells. Our findings show that the RNN approach provides more accurate predictions and is more robust with respect to changes in the training data. We consider the impact of each model’s predictions in live performance and ask: “What is an accurate chord prediction anyway?”
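As a rough illustration of the HMM side of this comparison, harmony prediction can be framed as decoding the most likely sequence of hidden chord states from an observed melody, for instance with the Viterbi algorithm. The sketch below is not the authors' implementation: the three-chord vocabulary, the pitch-class encoding of the melody, and all probabilities are hypothetical values chosen only to make the example runnable.

```python
import math

# Hypothetical toy setup (not from the paper): hidden states are chords,
# observations are melody pitch classes (0 = C, 2 = D, 4 = E, 5 = F, ...).
CHORDS = ["C", "F", "G"]
CHORD_TONES = {"C": {0, 4, 7}, "F": {5, 9, 0}, "G": {7, 11, 2}}


def emission(chord, pitch_class):
    """Assumed emission model: chord tones share 0.7, other pitch classes 0.3."""
    return 0.7 / 3 if pitch_class in CHORD_TONES[chord] else 0.3 / 9


def transition(prev, cur):
    """Assumed transition model: mild preference for keeping the same chord."""
    return 0.5 if prev == cur else 0.25


def viterbi(melody):
    """Return the most likely chord sequence for a list of pitch classes."""
    # Work in log space so long melodies do not underflow.
    score = {
        c: math.log(1.0 / len(CHORDS)) + math.log(emission(c, melody[0]))
        for c in CHORDS
    }
    backpointers = []
    for pc in melody[1:]:
        new_score, pointers = {}, {}
        for cur in CHORDS:
            # Best predecessor chord for ending in `cur` at this step.
            best_prev = max(
                CHORDS, key=lambda p: score[p] + math.log(transition(p, cur))
            )
            new_score[cur] = (
                score[best_prev]
                + math.log(transition(best_prev, cur))
                + math.log(emission(cur, pc))
            )
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final chord to recover the full path.
    path = [max(CHORDS, key=lambda c: score[c])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    # One chord symbol is predicted per melody note in this toy version.
    print(viterbi([0, 4, 7, 5, 9, 0, 7, 11, 2]))
```

In a trained model the transition and emission tables would be estimated from a corpus of melodies annotated with chords rather than set by hand, and a BLSTM would replace these tables with a learned sequence classifier over the same inputs and outputs.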

Notes

1. This research is an extension of the first author’s master’s thesis [24].
2. Performance with the PSCA Looper (direct download): https://goo.gl/59kVko.

References

Bahl, L.R., Jelinek, F., Mercer, R.L.: A maximum likelihood approach to continuous speech recognition. In: Readings in Speech Recognition, pp. 308–319. Elsevier (1990)
Brunner, G., Wang, Y., Wattenhofer, R., Wiesendanger, J.: Jambot: music theory aware chord based generation of polyphonic music with LSTMs. In: 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 519–526. IEEE (2017). https://doi.org/10.1109/ICTAI.2017.00085
Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., Efros, A.A.: Large-scale study of curiosity-driven learning. In: Proceedings of the International Conference on Learning Representations (ICLR) (2019). https://arxiv.org/abs/1808.04355
Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measur. 20(1), 37–46 (1960). https://doi.org/10.1177/001316446002000104
Cuthbert, M.S., Ariza, C.: music21: a toolkit for computer-aided musicology and symbolic music data. In: Downie, J.S., Veltkamp, R.C. (eds.) Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pp. 637–642. International Society for Music Information Retrieval, Utrecht (2010)
Eck, D., Schmidhuber, J.: Finding temporal structure in music: blues improvisation with LSTM recurrent networks. In: Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, pp. 747–756. IEEE (2002). https://doi.org/10.1109/NNSP.2002.1030094
Fan, W., Stolfo, S.J., Zhang, J., Chan, P.K.: Adacost: misclassification cost-sensitive boosting. In: Proceedings of the Sixteenth International Conference on Machine Learning, ICML 1999, vol. 99, pp. 97–105 (1999)
Forney, G.D.: The Viterbi algorithm. Proc. IEEE 61(3), 268–278 (1973)
Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 18(5–6), 602–610 (2005)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lehman, J., Stanley, K.O.: Abandoning objectives: evolution through the search for novelty alone. Evol. Comput. 19(2), 189–223 (2011)
Lim, H., Rhyu, S., Lee, K.: Chord generation from symbolic melody using BLSTM networks. In: 18th International Society for Music Information Retrieval Conference (2017)
Martin, C.P., Ellefsen, K.O., Torresen, J.: Deep predictive models in interactive music. arXiv e-prints, January 2018. https://arxiv.org/abs/1801.10492
Martin, C.P., Torresen, J.: RoboJam: a musical mixture density network for collaborative touchscreen interaction. In: Liapis, A., Romero Cardalda, J.J., Ekárt, A. (eds.) EvoMUSART 2018. LNCS, vol. 10783, pp. 161–176. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-77583-8_11
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Pachet, F., Roy, P., Moreira, J., d’Inverno, M.: Reflexive loopers for solo musical improvisation. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2013, pp. 2205–2208. ACM, New York (2013). https://doi.org/10.1145/2470654.2481303
Rabiner, L., Juang, B.: An introduction to hidden Markov models. IEEE ASSP Mag. 3(1), 4–16 (1986). https://doi.org/10.1109/MASSP.1986.1165342
Raczyński, S.A., Fukayama, S., Vincent, E.: Melody harmonization with interpolated probabilistic models. J. New Music Res. 42(3), 223–235 (2013)
Schmidhuber, J.: Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Connection Sci. 18(2), 173–187 (2006)
Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Trans. Sig. Process. 45(11), 2673–2681 (1997)
Simon, I., Morris, D., Basu, S.: Mysong: automatic accompaniment generation for vocal melodies. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2008, pp. 725–734. ACM, New York (2008). https://doi.org/10.1145/1357054.1357169
Tokuda, K., Zen, H., Black, A.W.: An HMM-based speech synthesis system applied to English. In: IEEE Speech Synthesis Workshop, pp. 227–230 (2002)
Wallace, B.: Predictive songwriting with concatenative accompaniment. Master’s thesis, Department of Informatics, University of Oslo (2018)

Acknowledgment

This work was supported by The Research Council of Norway as a part of the Engineering Predictability with Embodied Cognition (EPEC) project, under grant agreement 240862 and through its Centres of Excellence scheme, project number 262762.

Author information

Authors and Affiliations

RITMO Centre for Interdisciplinary Studies in Rhythm, Time, and Motion, Department of Informatics, University of Oslo, Oslo, Norway
Benedikte Wallace & Charles P. Martin

Corresponding author

Correspondence to Benedikte Wallace.

Editor information

Editors and Affiliations

Aston University, Birmingham, UK
Anikó Ekárt
University of Malta, Msida, Malta
Antonios Liapis
University of A Coruña, A Coruña, Spain
María Luz Castro Pena

Cite this paper

Wallace, B., Martin, C.P. (2019). Comparing Models for Harmony Prediction in an Interactive Audio Looper. In: Ekárt, A., Liapis, A., Castro Pena, M.L. (eds) Computational Intelligence in Music, Sound, Art and Design. EvoMUSART 2019. Lecture Notes in Computer Science, vol 11453. Springer, Cham. https://doi.org/10.1007/978-3-030-16667-0_12

DOI: https://doi.org/10.1007/978-3-030-16667-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16666-3
Online ISBN: 978-3-030-16667-0
eBook Packages: Computer Science (R0)