PAUL-2: An Upgraded Transformer-Based Redesign of the Algorithmic Composer PAUL

  • Conference paper
AIxIA 2023 – Advances in Artificial Intelligence (AIxIA 2023)

Abstract

In this paper, we introduce PAUL-2, an algorithmic composer for two-track piano pieces of specifiable difficulty levels, as a ground-up redesign of its predecessor system PAUL. While PAUL was designed using a long short-term memory neural network along with a sequence-to-sequence network, PAUL-2 is based on the state-of-the-art transformer architecture and makes use of relative attention. A shortcoming of PAUL was that it generated unsatisfying accompanying tracks and allowed for only a few difficulty levels. PAUL-2 overcomes these limitations and theoretically supports an arbitrary number of difficulty classes because it uses an additional encoder for handling difficulty information. We also carried out a medium-scale survey, which showed that participants evaluated the output of PAUL-2 quite favourably.
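The abstract names relative attention without giving its form. As a rough illustration only, the following sketch shows one generic formulation: causal single-head attention in which each query also scores a learned embedding of its clipped distance to the key. The function name `relative_attention`, the `rel_emb` table, and the clipping parameter `max_dist` are assumptions made for this example and are not taken from PAUL-2.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def relative_attention(q, k, v, rel_emb, max_dist):
    """Causal single-head attention with a relative-position term.

    q, k, v : lists of length-d vectors, one per sequence position.
    rel_emb : hypothetical table mapping a clipped distance in
              [0, max_dist] to a length-d vector (learned in a real
              model, random in the demo below).
    Returns the attended outputs and the attention weights per position.
    """
    d = len(q[0])
    outputs, all_weights = [], []
    for i in range(len(q)):
        logits = []
        for j in range(i + 1):  # causal: position i only sees j <= i
            dist = min(i - j, max_dist)  # clip long distances
            # content score plus relative-position score, scaled by sqrt(d)
            score = dot(q[i], k[j]) + dot(q[i], rel_emb[dist])
            logits.append(score / math.sqrt(d))
        weights = softmax(logits)
        all_weights.append(weights)
        outputs.append(
            [sum(weights[j] * v[j][c] for j in range(i + 1)) for c in range(d)]
        )
    return outputs, all_weights


if __name__ == "__main__":
    random.seed(0)
    n, d, max_dist = 5, 4, 3

    def rand_vec():
        return [random.gauss(0.0, 1.0) for _ in range(d)]

    q = [rand_vec() for _ in range(n)]
    k = [rand_vec() for _ in range(n)]
    v = [rand_vec() for _ in range(n)]
    rel_emb = {dist: rand_vec() for dist in range(max_dist + 1)}

    out, weights = relative_attention(q, k, v, rel_emb, max_dist)
    print(len(out), len(out[0]))
```

Because the position term depends only on the distance between query and key, the same embedding is reused wherever two events are equally far apart, which is the usual motivation for relative attention in music models. Production implementations compute the same quantity with batched matrix operations rather than nested loops.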

We would like to thank Wolfgang Schmidtmayr from the University of Music and Performing Arts Vienna and Geraldine Fitzpatrick from the Human-Computer Interaction Group at our university for their valuable input. We would also like to thank the anonymous reviewers for their helpful comments.

Notes

  1. Sight-reading refers to the act of performing sheet music without having studied it beforehand.

  2. PAUL is named after the well-known Austrian pianist Paul Badura-Skoda (6th October 1927 - 25th September 2019).

Author information

Corresponding author

Correspondence to Felix Schön.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Schön, F., Tompits, H. (2023). PAUL-2: An Upgraded Transformer-Based Redesign of the Algorithmic Composer PAUL. In: Basili, R., Lembo, D., Limongelli, C., Orlandini, A. (eds) AIxIA 2023 – Advances in Artificial Intelligence. AIxIA 2023. Lecture Notes in Computer Science, vol 14318. Springer, Cham. https://doi.org/10.1007/978-3-031-47546-7_19

  • DOI: https://doi.org/10.1007/978-3-031-47546-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47545-0

  • Online ISBN: 978-3-031-47546-7

  • eBook Packages: Computer Science, Computer Science (R0)
