DOI: 10.1145/3511047.3536412
Extended Abstract

ReStyle-MusicVAE: Enhancing User Control of Deep Generative Music Models with Expert Labeled Anchors

Published: 04 July 2022

Abstract

Deep generative models have emerged as one of the most actively researched topics in artificial intelligence. An area drawing increasing attention is the automatic generation of music, with applications including systems that support and inspire the process of music composition. For these assistive systems to be successful and accepted, it is imperative to give users agency and let them express their personal style in the process of composition.
In this paper, we demonstrate ReStyle-MusicVAE, a system for human-AI co-creation in music composition. More specifically, ReStyle-MusicVAE combines the automatic melody generation and variation approach of MusicVAE and adds semantic control dimensions to further steer the process. To this end, expert-annotated melody lines created for music production are used to define stylistic anchors, which serve as semantic references for interpolation. We present an easy-to-use web app built on top of the Magenta.js JavaScript library and pre-trained MusicVAE checkpoints.
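The core idea above is steering MusicVAE's latent-space interpolation toward expert-labeled stylistic anchors. As an illustrative sketch only (not the authors' implementation; the latent dimensionality, style labels, and function names here are hypothetical), an anchor can be formed by averaging the latent codes of melodies sharing an expert label, and a user melody's code can then be blended toward that anchor:

```python
import numpy as np

def style_anchor(latent_codes):
    """Average the latent codes of expert-labeled melodies into one anchor vector."""
    return np.mean(latent_codes, axis=0)

def steer_toward_anchor(z_user, z_anchor, alpha):
    """Linear interpolation in latent space: alpha=0 keeps the user's melody,
    alpha=1 moves fully to the stylistic anchor."""
    return (1.0 - alpha) * z_user + alpha * z_anchor

rng = np.random.default_rng(0)
latent_dim = 256  # hypothetical latent size; actual size depends on the checkpoint

# Hypothetical: latent codes of ten expert-labeled melodies for one style
style_codes = rng.normal(size=(10, latent_dim))
z_style = style_anchor(style_codes)

z_user = rng.normal(size=latent_dim)       # encoding of the user's melody
z_mix = steer_toward_anchor(z_user, z_style, alpha=0.5)
# z_mix would then be decoded back into a melody by the generative model
```

In the actual system, encoding and decoding are handled by pre-trained MusicVAE checkpoints via Magenta.js; the sketch only shows the anchor-averaging and interpolation arithmetic in the latent space.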

Supplementary Material

MP4 File (UMAP22_ReStyle-MusicVAE_EnhancingUserControlOfDeepGenerativeMusicModelsWithExpertLabeledAnchors.mp4)
Video Presentation


Cited By

  • (2024) And Justice for Art(ists): Metaphorical Design as a Method for Creating Culturally Diverse Human-AI Music Composition Experiences. 2024 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), 1–4. https://doi.org/10.1109/HORA61326.2024.10550680. Online publication date: 23 May 2024.


Published In

UMAP '22 Adjunct: Adjunct Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization
July 2022, 409 pages
ISBN: 9781450392327
DOI: 10.1145/3511047
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. music generation
  2. user control
  3. variational autoencoder

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

UMAP '22

Acceptance Rates

Overall Acceptance Rate 162 of 633 submissions, 26%


