DOI: 10.1145/3206025.3206074
Feature Selection and Multimodal Fusion for Estimating Emotions Evoked by Movie Clips

Published: 05 June 2018

Abstract

Perceptual understanding of media content has many applications, including content-based retrieval, marketing, content optimization, psychological assessment, and affect-based learning. In this paper, we model audiovisual features extracted from videos via machine learning approaches to estimate the affective responses of viewers. We use the LIRIS-ACCEDE dataset and the MediaEval 2017 Challenge setting to evaluate the proposed methods. This dataset is composed of movies of professional or amateur origin, annotated with viewers' arousal, valence, and fear scores. We extract a number of audio features, such as Mel-frequency cepstral coefficients, and visual features, such as dense SIFT, hue-saturation histograms, and features from a deep neural network trained for object recognition. We contrast two different approaches, and report experiments with different fusion and smoothing strategies. We demonstrate the benefit of feature selection and multimodal fusion for estimating affective responses to movie segments.
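The fusion and smoothing strategies mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the fusion weight, smoothing factor, and prediction values are illustrative assumptions, and the functions shown (weighted late fusion of per-modality predictions, followed by exponential smoothing over consecutive segments) stand in for whichever concrete variants the paper evaluates.

```python
# Hypothetical sketch of late fusion + temporal smoothing for
# per-segment affect (e.g. arousal) predictions. All values below
# are made up for illustration; nothing is taken from the paper.

def late_fuse(audio_preds, visual_preds, w_audio=0.5):
    """Weighted average of audio and visual predictions per segment."""
    return [w_audio * a + (1.0 - w_audio) * v
            for a, v in zip(audio_preds, visual_preds)]

def exp_smooth(series, alpha=0.5):
    """Simple exponential smoothing over a temporal prediction series."""
    out = [series[0]]
    for x in series[1:]:
        out.append(alpha * x + (1.0 - alpha) * out[-1])
    return out

# Toy per-segment predictions from two modality-specific regressors.
audio = [0.2, 0.8, 0.6]
visual = [0.4, 0.4, 0.8]

fused = late_fuse(audio, visual)    # ≈ [0.3, 0.6, 0.7]
smoothed = exp_smooth(fused)        # ≈ [0.3, 0.45, 0.575]
```

Smoothing is useful here because evoked emotion evolves slowly across consecutive movie segments, so damping frame-to-frame jitter in the fused predictions tends to reduce error against continuous annotations.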


Cited By

  • Recognizing Emotions evoked by Movies using Multitask Learning. 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 1-8, 28 September 2021. DOI: 10.1109/ACII52823.2021.9597464

Published In

ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval
June 2018, 550 pages
ISBN: 9781450350464
DOI: 10.1145/3206025

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

ICMR '18 paper acceptance rate: 44 of 136 submissions (32%). Overall acceptance rate: 254 of 830 submissions (31%).
