Abstract
Musicians often use tools such as loop-pedals and multitrack recorders to assist in improvisation and songwriting, but these tools generally don’t proactively contribute aspects of the musical performance. In this work, we introduce an interactive audio looper that predicts a loop’s harmony, and constructs an accompaniment automatically using concatenative synthesis. The system uses a machine learning (ML) model for harmony prediction, that is, it generates a sequence of chords symbols for a given melody. We analyse the performance of two potential ML models for this task: a hidden Markov model (HMM) and a recurrent neural network (RNN) with bidirectional long short-term memory (BLSTM) cells. Our findings show that the RNN approach provides more accurate predictions and is more robust with respect to changes in the training data. We consider the impact of each model’s predictions in live performance and ask: “What is an accurate chord prediction anyway?”
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
This research is an extension of the author’s master’s thesis [24].
- 2.
Performance with the PSCA Looper (Direct download): https://goo.gl/59kVko.
References
Bahl, L.R., Jelinek, F., Mercer, R.L.: A maximum likelihood approach to continuous speech recognition. In: Readings in Speech Recognition, pp. 308–319. Elsevier (1990)
Brunner, G., Wang, Y., Wattenhofer, R., Wiesendanger, J.: Jambot: music theory aware chord based generation of polyphonic music with LSTMs. In: 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 519–526. IEEE (2017). https://doi.org/10.1109/ICTAI.2017.00085
Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., Efros, A.A.: Large-scale study of curiosity-driven learning. In: Proceedings of the International Conference on Learning Representations (ICLR) (2019). https://arxiv.org/abs/1808.04355
Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measur. 20(1), 37–46 (1960). https://doi.org/10.1177/001316446002000104
Cuthbert, M.S., Ariza, C.: music21: a toolkit for computer-aided musicology and symbolic music data. In: Downie, J.S., Veltkamp, R.C. (eds.) Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pp. 637–642. International Society for Music Information Retrieval, Utrecht (2010)
Eck, D., Schmidhuber, J.: Finding temporal structure in music: blues improvisation with LSTM recurrent networks. In: Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, pp. 747–756. IEEE (2002). https://doi.org/10.1109/NNSP.2002.1030094
Fan, W., Stolfo, S.J., Zhang, J., Chan, P.K.: Adacost: misclassification cost-sensitive boosting. In: Proceedings of the Sixteenth International Conference on Machine Learning, ICML 1999, vol. 99, pp. 97–105 (1999)
Forney, G.D.: The viterbi algorithm. Proc. IEEE 61(3), 268–278 (1973)
Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 5(18), 602–610 (2005)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lehman, J., Stanley, K.O.: Abandoning objectives: evolution through the search for novelty alone. Evol. Comput. 19(2), 189–223 (2011)
Lim, H., Rhyu, S., Lee, K.: Chord generation from symbolic melody using BLSTM networks. In: 18th International Society for Music Information Retrieval Conference (2017)
Martin, C.P., Ellefsen, K.O., Torresen, J.: Deep predictive models in interactive music. arXiv e-prints, January 2018. https://arxiv.org/abs/1801.10492
Martin, C.P., Torresen, J.: RoboJam: a musical mixture density network for collaborative touchscreen interaction. In: Liapis, A., Romero Cardalda, J.J., Ekárt, A. (eds.) EvoMUSART 2018. LNCS, vol. 10783, pp. 161–176. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-77583-8_11
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Pachet, F., Roy, P., Moreira, J., d’Inverno, M.: Reflexive loopers for solo musical improvisation. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2013, pp. 2205–2208. ACM, New York (2013). https://doi.org/10.1145/2470654.2481303
Rabiner, L., Juang, B.: An introduction to hidden Markov models. IEEE ASSP Mag. 3(1), 4–16 (1986). https://doi.org/10.1109/MASSP.1986.1165342
Raczyński, S.A., Fukayama, S., Vincent, E.: Melody harmonization with interpolated probabilistic models. J. New Music Res. 42(3), 223–235 (2013)
Schmidhuber, J.: Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Connection Sci. 18(2), 173–187 (2006)
Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Trans. Sig. Process. 45(11), 2673–2681 (1997)
Simon, I., Morris, D., Basu, S.: Mysong: automatic accompaniment generation for vocal melodies. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2008, pp. 725–734. ACM, New York (2008). https://doi.org/10.1145/1357054.1357169
Tokuda, K., Zen, H., Black, A.W.: An HMM-based speech synthesis system applied to English. In: IEEE Speech Synthesis Workshop, pp. 227–230 (2002)
Wallace, B.: Predictive songwriting with concatenative accompaniment. Master’s thesis, Department of Informatics, University of Oslo (2018)
Acknowledgment
This work was supported by The Research Council of Norway as a part of the Engineering Predictability with Embodied Cognition (EPEC) project, under grant agreement 240862 and through its Centres of Excellence scheme, project number 262762.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wallace, B., Martin, C.P. (2019). Comparing Models for Harmony Prediction in an Interactive Audio Looper. In: Ekárt, A., Liapis, A., Castro Pena, M.L. (eds) Computational Intelligence in Music, Sound, Art and Design. EvoMUSART 2019. Lecture Notes in Computer Science(), vol 11453. Springer, Cham. https://doi.org/10.1007/978-3-030-16667-0_12
Download citation
DOI: https://doi.org/10.1007/978-3-030-16667-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16666-3
Online ISBN: 978-3-030-16667-0
eBook Packages: Computer ScienceComputer Science (R0)