ABSTRACT
Recent content creation systems let users generate a variety of high-quality content (e.g., images, 3D models, and melodies) simply by specifying a parameter set (e.g., a latent vector of a deep generative model). The task is then to search for a parameter set that produces the desired content. To facilitate this task, researchers have investigated user-in-the-loop optimization, in which the system samples candidate solutions, asks the user for preferential feedback on them, and repeats this procedure until the desired solution is found. In this work, we investigate a novel approach to enhancing this interactive process: allowing users to control the sampling behavior. More specifically, we allow users to adjust the balance between exploration (i.e., favoring diverse samples) and exploitation (i.e., favoring focused samples) in each iteration. To evaluate how this approach affects the user experience and optimization behavior, we implement it in a melody composition system that combines a deep generative model with Bayesian optimization. Our experiments suggest that this approach could improve user engagement and optimization performance.
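As a rough illustration of what a user-adjustable exploration-exploitation balance can look like in Bayesian optimization, the sketch below scores candidates with an upper-confidence-bound rule, mean + weight × standard deviation, where the weight stands in for the user's control. This is a minimal sketch under stated assumptions, not the system evaluated here: the abstract does not say which acquisition function is used, the search space is a 1-D stand-in for a latent space, numeric scores replace preferential feedback, and all names (`rbf_kernel`, `gp_posterior`, `next_sample`, `exploration_weight`) are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of 1-D points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    # Prior variance is 1 because rbf_kernel(x, x) == 1.
    var = 1.0 - np.sum(k_qo * np.linalg.solve(k_oo, k_qo.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def next_sample(x_obs, y_obs, x_query, exploration_weight):
    """Return the candidate maximizing mean + exploration_weight * std.

    exploration_weight is the hypothetical user control (an assumption):
    0 gives pure exploitation; larger values favor uncertain regions.
    """
    mean, std = gp_posterior(x_obs, y_obs, x_query)
    return x_query[np.argmax(mean + exploration_weight * std)]


# Two scored samples; the second one was rated much higher.
x_obs = np.array([0.1, 0.3])
y_obs = np.array([0.2, 1.0])
candidates = np.linspace(0.0, 1.0, 101)

focused = next_sample(x_obs, y_obs, candidates, exploration_weight=0.0)
diverse = next_sample(x_obs, y_obs, candidates, exploration_weight=10.0)
print(focused, diverse)
```

With the weight at zero the next sample stays near the best-rated point, while a large weight pushes it toward the region that has not been sampled yet. Exposing that single weight in each iteration is one simple way to hand the balance to the user.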
Index Terms
- Interactive Exploration-Exploitation Balancing for Generative Melody Composition