Abstract
Scenes can convey emotion, as music does. If so, it may be possible, given an image, to generate music that evokes a similar emotional reaction in users. The challenge lies in how to do that. In this paper, we use the Hue, Saturation and Lightness features of image samples extracted from video excerpts, together with the tempo, loudness and rhythm of audio samples extracted from the same excerpts, to train a group of neural networks, including a Recurrent Neural Network and a Neuro-Fuzzy Network, and obtain an audio signal that evokes a similar emotional response in a listener. This work could be an important contribution to the field of Human-Computer Interaction because it can improve the interaction between computers and humans. Experimental results show that the model effectively produces audio that matches the video and evokes a similar emotion in the viewer.
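As an illustration of the image side of this pipeline, the following minimal sketch summarises a frame by its average Hue, Saturation and Lightness. It is not the authors' implementation: the function name `hsl_features`, the choice of a per-frame average, and the saturation-weighted circular mean for hue are all assumptions made here for clarity.

```python
import colorsys
import math


def hsl_features(pixels):
    """Return (mean hue, mean saturation, mean lightness), each in [0, 1].

    `pixels` is a non-empty sequence of (r, g, b) tuples with 8-bit channels.
    Hypothetical helper for illustration; the paper does not specify this step.
    """
    sin_sum = cos_sum = sat_sum = light_sum = 0.0
    for r, g, b in pixels:
        # colorsys expects floats in [0, 1] and returns (h, l, s) in that order.
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        # Hue is an angle, so it is averaged on the unit circle. Weighting by
        # saturation keeps grey pixels (undefined hue) from biasing the result.
        sin_sum += s * math.sin(2 * math.pi * h)
        cos_sum += s * math.cos(2 * math.pi * h)
        sat_sum += s
        light_sum += l
    n = len(pixels)
    mean_hue = (math.atan2(sin_sum, cos_sum) / (2 * math.pi)) % 1.0
    return mean_hue, sat_sum / n, light_sum / n
```

A three-element vector of this kind, computed per sampled frame, is one plausible form for the visual input to the networks; the tempo, loudness and rhythm descriptors of the paired audio would need a separate audio-analysis step that is not sketched here.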
Acknowledgments
This work was supported by the Industrial Strategic Technology Development Program (10044009) funded by the Ministry of Trade, Industry and Energy (MOTIE, Korea).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Sergio, G.C., Lee, M. (2016). Audio Generation from Scene Considering Its Emotion Aspect. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9948. Springer, Cham. https://doi.org/10.1007/978-3-319-46672-9_9
DOI: https://doi.org/10.1007/978-3-319-46672-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46671-2
Online ISBN: 978-3-319-46672-9
eBook Packages: Computer Science, Computer Science (R0)