ABSTRACT
This study explores the development of intelligent affective virtual environments driven by bimodal emotion recognition and multimodal feedback. Combined semantic and acoustic analyses predict the emotions conveyed by spoken language, fostering an expressive and transparent control structure. Textual content and emotional predictions are mapped to virtual environments in real locations as audiovisual feedback.
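As an illustration only, the sketch below shows what such a bimodal prediction step could look like: a publicly available text-emotion classifier stands in for the semantic model, and mean MFCCs stand in for the acoustic features. The model name, feature choice, and the omitted acoustic classifier are assumptions for demonstration, not the system's actual components.

```python
# Illustrative sketch of a bimodal (semantic + acoustic) emotion prediction step.
# The classifier name and the MFCC features are stand-in assumptions.
import librosa
from transformers import pipeline

# Semantic branch: a public text-emotion classifier used here as a stand-in
# for the BERT-based model described in the paper.
text_classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",
)

def semantic_emotion(transcript):
    """Predict an emotion label and confidence from the transcribed speech."""
    result = text_classifier(transcript)[0]
    return result["label"], result["score"]

def acoustic_features(wav_path):
    """Summarise spectral/prosodic cues (mean MFCCs) for an acoustic classifier."""
    y, sr = librosa.load(wav_path, sr=16000, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)  # one 20-dimensional vector per utterance

# The acoustic classifier itself (e.g. one trained on an emotional speech
# corpus such as RAVDESS) is assumed to exist elsewhere and is omitted here.
```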
To demonstrate the application of this system, we developed a case study titled "En train d'oublier," focusing on a train cemetery in Uyuni, Bolivia. The train cemetery holds historical significance as a site where abandoned trains symbolize the passage of time and the interplay between human activity and nature's reclamation. The space is transformed into an immersive and emotionally poetic experience through oral language and affective virtual environments that activate memories: the system uses the transcribed text to synthesize images and modifies the musical output according to the predicted emotional states.
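A minimal, hedged sketch of this mapping is given below: the transcript seeds a text-to-image prompt and the predicted emotion selects simple musical parameters. The prompt template and the emotion-to-music table are illustrative assumptions, not the installation's actual mapping.

```python
# Illustrative mapping from transcript + predicted emotion to an image prompt
# and basic musical parameters; all values here are assumptions.
MUSIC_PARAMS = {
    # emotion label -> (tempo in BPM, mode)
    "joy":     (120, "major"),
    "sadness": ( 60, "minor"),
    "anger":   (140, "minor"),
    "fear":    (100, "minor"),
    "neutral": ( 90, "major"),
}

def map_to_environment(transcript, emotion):
    """Derive an image-synthesis prompt and music settings for one utterance."""
    prompt = (
        "abandoned trains in the Uyuni train cemetery, "
        f"{transcript}, {emotion} atmosphere"
    )
    tempo, mode = MUSIC_PARAMS.get(emotion, (90, "major"))
    return {"image_prompt": prompt, "tempo_bpm": tempo, "mode": mode}

# Example: a remembered phrase spoken with sadness slows the music and
# darkens the imagery.
print(map_to_environment("my grandfather worked on these rails", "sadness"))
```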
The proposed bimodal emotion recognition models achieve accuracies of 94% and 89%. The audiovisual mapping strategy accounts for divergence between the two predictions, generating an intended tension between the graphical and the musical representations. Using video and web art techniques, we experimented with the generated environments to create diverse poetic proposals.
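One concrete way to read this divergence, sketched below under stated assumptions, is to place each predicted label on Russell's valence-arousal circumplex and treat the distance between the semantic and acoustic predictions as a tension parameter. The coordinates and normalisation are illustrative, not the values used in the work.

```python
# Illustrative divergence measure between the semantic and acoustic predictions.
# The circumplex coordinates and the tension formula are assumptions.
import math

# Rough valence/arousal coordinates on Russell's circumplex, in [-1, 1].
CIRCUMPLEX = {
    "joy":      ( 0.8,  0.5),
    "sadness":  (-0.7, -0.4),
    "anger":    (-0.6,  0.7),
    "fear":     (-0.6,  0.6),
    "surprise": ( 0.3,  0.8),
    "neutral":  ( 0.0,  0.0),
}

def tension(semantic_label, acoustic_label):
    """Normalised distance between the two predicted emotions (0 = agreement)."""
    return (
        math.dist(CIRCUMPLEX[semantic_label], CIRCUMPLEX[acoustic_label])
        / math.dist((-1, -1), (1, 1))
    )

# Example: text predicted as "joy" but voice sounding "sadness" yields a high
# tension value, which could pull the imagery and the music in opposite
# emotional directions.
print(round(tension("joy", "sadness"), 2))
```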