ABSTRACT
Extensive research has been conducted on procedural music generation in real-time applications such as accompaniment for musicians, visual narratives, and games. However, less attention has been paid to enhancing textual narratives through music. In this paper, we present Mood Into Note Using Extracted Text (MINUET), a novel system that procedurally generates music for segments of a textual narrative using sentiment analysis. The flow and sentiment derived from the text condition the accompanying music, so that the generated music varies as the sentiment of the narrative changes. Using an ensemble predictor model to classify sentences by emotion, MINUET generates text-accompanying music with the goal of enhancing a reader’s experience beyond the limits of the author’s words. Music is rendered via the jMusic library from a set of Markov chains specific to each emotion, and the mood classifier is evaluated via stratified 10-fold cross validation. The development of MINUET enables reflection on, and analysis of, the features that affect the quality of generated musical accompaniment for text. It also serves as a sandbox, an implemented and extendable experiential artifact, for further evaluating sentiment-based approaches on both the text-analysis and music-generation sides within a coherent experience.
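The abstract's evaluation setup, an ensemble of classifiers assigning emotions to sentences, scored with stratified 10-fold cross validation, can be sketched with scikit-learn (itself among the tools the paper draws on). This is a minimal illustration, not the paper's actual pipeline: the base estimators, TF-IDF features, and the keyword-seeded toy corpus below are all assumptions made for the sake of a self-contained example.

```python
# Hedged sketch: a hard-voting ensemble over TF-IDF features, evaluated
# with stratified 10-fold cross validation. The corpus is a hypothetical
# toy stand-in (10 sentences per emotion), not the paper's training data.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Illustrative emotion-keyword templates (assumed, for demo purposes).
templates = {
    "joy": "what a wonderful delightful happy day",
    "sadness": "a gloomy lonely tearful miserable evening",
    "anger": "furious and outraged seething at the insult",
    "fear": "terrified trembling dreading the dark unknown",
}
sentences, labels = [], []
for emotion, text in templates.items():
    for i in range(10):  # 10 examples per class so 10 folds stratify cleanly
        sentences.append(f"{text} {i}")
        labels.append(emotion)

# Three heterogeneous base learners combined by majority vote.
model = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("svm", LinearSVC()),
            ("tree", DecisionTreeClassifier(random_state=0)),
        ],
        voting="hard",
    ),
)
scores = cross_val_score(
    model, sentences, labels,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
)
print(f"mean accuracy over 10 folds: {scores.mean():.2f}")
```

With per-class keyword vocabularies this toy task is easy; on real sentence data the fold-to-fold variance of the scores is precisely what stratified cross validation is meant to expose.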
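The generation side, one Markov chain per emotion driving note selection, can likewise be sketched. jMusic is a Java library, so the following Python fragment illustrates only the chain walk itself; the transition tables are invented for illustration and are not the chains MINUET uses.

```python
# Hedged sketch: first-order Markov chains over MIDI pitches, one chain
# per emotion. The transition tables are illustrative assumptions
# (major-leaning for "joy", minor-leaning for "sadness"), not the
# paper's trained chains; MINUET renders its output via jMusic.
import random

CHAINS = {
    "joy": {  # C-major pitches, upward-leaning transitions
        60: [62, 64, 67], 62: [64, 60, 67], 64: [65, 67, 62],
        65: [67, 64], 67: [69, 64, 72], 69: [67, 72], 72: [67, 69],
    },
    "sadness": {  # C-minor pitches, downward-leaning transitions
        60: [58, 63], 58: [56, 60], 56: [55, 58],
        55: [56, 60], 63: [62, 60], 62: [60, 63],
    },
}

def generate(emotion, length=16, seed=None):
    """Walk the given emotion's chain to produce a pitch sequence."""
    rng = random.Random(seed)
    chain = CHAINS[emotion]
    pitch = rng.choice(list(chain))       # random starting state
    melody = [pitch]
    for _ in range(length - 1):
        pitch = rng.choice(chain[pitch])  # sample a successor pitch
        melody.append(pitch)
    return melody

print(generate("joy", seed=42))
```

Swapping the active chain when the classified sentiment of the text changes is what lets the accompaniment track the narrative's emotional arc.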