Abstract
In this paper, we introduce \(\mathsf {PAUL \text {-}2}\), an algorithmic composer for two-track piano pieces of specifiable difficulty levels, as a ground-up redesign of its predecessor system \(\textsf{PAUL}\). While \(\textsf{PAUL}\) was built on a long short-term memory neural network combined with a sequence-to-sequence network, \(\mathsf {PAUL \text {-}2}\) is based on the state-of-the-art transformer architecture and makes use of relative attention. A shortcoming of \(\textsf{PAUL}\) was that it generated unsatisfying accompanying tracks and allowed for only a few difficulty levels. \(\mathsf {PAUL \text {-}2}\) overcomes these limitations and theoretically supports an arbitrary number of difficulty classes because it utilises an additional encoder for handling difficulty information. We also carried out a medium-scale survey, which showed that participants evaluated the output of \(\mathsf {PAUL \text {-}2}\) quite favourably.
We would like to thank Wolfgang Schmidtmayr from the University of Music and Performing Arts Vienna and Geraldine Fitzpatrick from the Human-Computer Interaction Group at our university for their valuable input. We would also like to thank the anonymous reviewers for their helpful comments.
Notes
- 1. Sight-reading refers to the act of performing sheet music without having studied it beforehand.
- 2. \(\textsf{PAUL}\) is named after the well-known Austrian pianist Paul Badura-Skoda (6th October 1927 – 25th September 2019).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Schön, F., Tompits, H. (2023). PAUL-2: An Upgraded Transformer-Based Redesign of the Algorithmic Composer PAUL. In: Basili, R., Lembo, D., Limongelli, C., Orlandini, A. (eds) AIxIA 2023 – Advances in Artificial Intelligence. AIxIA 2023. Lecture Notes in Computer Science, vol 14318. Springer, Cham. https://doi.org/10.1007/978-3-031-47546-7_19
DOI: https://doi.org/10.1007/978-3-031-47546-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47545-0
Online ISBN: 978-3-031-47546-7
eBook Packages: Computer Science, Computer Science (R0)