Authors:
Rishi Madhok, Shivali Goel and Shweta Garg
Affiliation:
Delhi Technological University, India
Keyword(s):
LSTM, Deep Learning, Music Generation, Emotion, Sentiment.
Related Ontology Subjects/Areas/Topics:
AI and Creativity; Artificial Intelligence; Biomedical Engineering; Biomedical Signal Processing; Computational Intelligence; Evolutionary Computing; Health Engineering and Technology Applications; Human-Computer Interaction; Industrial Applications of AI; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Methodologies and Methods; Neural Networks; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems; Theory and Methods; Vision and Perception
Abstract:
Facial expressions are among the most intuitive indicators of a person’s emotions, as they naturally convey how a person currently feels. The aim of the proposed framework is to generate music corresponding to the emotion predicted by our model. The framework is divided into two models: an Image Classification Model and a Music Generation Model. A Convolutional Neural Network (CNN) first classifies the facial expression into one of seven major sentiment categories: Angry, Disgust, Fear, Happy, Sad, Surprise and Neutral. The Music Generation Model, a doubly stacked LSTM architecture, then generates music corresponding to the identified emotion. Finally, we evaluate the performance of the proposed framework using the emotional Mean Opinion Score (MOS), a popular evaluation metric for audio-visual data.
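To make the pipeline concrete, the following is a minimal NumPy sketch of how a doubly stacked (two-layer) LSTM could produce next-note probabilities conditioned on a predicted emotion label. The note vocabulary size, hidden size, one-hot emotion conditioning, and all function names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
VOCAB = 128   # assumed MIDI-style note vocabulary (not specified in the paper)
HIDDEN = 64   # assumed hidden-state size

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    i, f, g, o = np.split(z, 4)
    i, f, o = 1 / (1 + np.exp(-i)), 1 / (1 + np.exp(-f)), 1 / (1 + np.exp(-o))
    g = np.tanh(g)
    c = f * c + i * g          # update cell state
    h = o * np.tanh(c)         # update hidden state
    return h, c

def init_layer(in_dim, hid):
    """Random small weights for one LSTM layer (illustrative initialization)."""
    return (rng.normal(0, 0.1, (4 * hid, in_dim)),
            rng.normal(0, 0.1, (4 * hid, hid)),
            np.zeros(4 * hid))

# Two stacked LSTM layers ("doubly stacked"); the emotion is injected by
# concatenating a one-hot emotion vector to each note input (an assumption).
layer1 = init_layer(VOCAB + len(EMOTIONS), HIDDEN)
layer2 = init_layer(HIDDEN, HIDDEN)
W_out = rng.normal(0, 0.1, (VOCAB, HIDDEN))

def next_note_probs(notes, emotion):
    """Run a seed note sequence through both layers; softmax over next note."""
    e = np.eye(len(EMOTIONS))[EMOTIONS.index(emotion)]
    h1 = c1 = h2 = c2 = np.zeros(HIDDEN)
    for n in notes:
        x = np.concatenate([np.eye(VOCAB)[n], e])
        h1, c1 = lstm_step(x, h1, c1, *layer1)   # first LSTM layer
        h2, c2 = lstm_step(h1, h2, c2, *layer2)  # second (stacked) LSTM layer
    logits = W_out @ h2
    p = np.exp(logits - logits.max())
    return p / p.sum()

# Hypothetical usage: a C-D-E seed sequence under the "Happy" emotion.
probs = next_note_probs([60, 62, 64], "Happy")
print(probs.shape)  # (128,)
```

In a trained system the weights would of course be learned from an emotion-labelled music corpus rather than randomly initialized; the sketch only shows the forward pass and the emotion-conditioning mechanism.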