Ensemble learning on visual and textual data for social image emotion classification

Corchs, Silvia; Fersini, Elisabetta; Gasparini, Francesca

doi:10.1007/s13042-017-0734-0

Original Article
Published: 07 October 2017

Volume 10, pages 2057–2070, (2019)

International Journal of Machine Learning and Cybernetics

Silvia Corchs¹,
Elisabetta Fersini¹ &
Francesca Gasparini¹

Abstract

Texts, images and other information are posted every day on social networks, providing a large amount of multimodal data. The aim of this work is to investigate whether combining and integrating visual and textual data makes it possible to identify the emotions elicited by an image. We focus on image emotion classification within eight emotion categories: amusement, awe, contentment, excitement, anger, disgust, fear and sadness. For this classification task, we propose ensemble learning approaches based on the Bayesian model averaging method that combine five state-of-the-art classifiers. The proposed ensemble approaches consider predictions given by several classification models, based on visual and textual data, through late and early fusion schemes, respectively. Our investigations show that an ensemble method based on a late fusion of unimodal classifiers achieves high classification performance within all eight emotion classes. The improvement is higher when deep image representations are adopted as visual features, compared with hand-crafted ones.
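
To make the two fusion schemes concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes that early fusion concatenates the visual and textual feature vectors before a single classifier is trained, and that late fusion combines the class posteriors of separately trained unimodal classifiers as a weighted average, in the spirit of Bayesian model averaging. The function names, the model weights and the posterior values are hypothetical, and only two models are shown instead of the five used in the paper. How the weights are actually estimated is not stated in this preview.

```python
from typing import List, Sequence

# The eight emotion categories named in the abstract.
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]


def early_fusion(visual: Sequence[float], textual: Sequence[float]) -> List[float]:
    """Early fusion (assumed form): concatenate the two feature vectors
    so that one classifier can be trained on the joint representation."""
    return list(visual) + list(textual)


def bma_late_fusion(posteriors: Sequence[Sequence[float]],
                    model_weights: Sequence[float]) -> List[float]:
    """Late fusion (assumed form): weighted average of per-model posteriors.

    posteriors[i][k] is the probability model i assigns to class k.
    model_weights[i] is a hypothetical, unnormalized score standing in for
    the posterior probability of model i given the training data.
    """
    total = sum(model_weights)
    if total <= 0:
        raise ValueError("model weights must sum to a positive value")
    n_classes = len(posteriors[0])
    fused = [0.0] * n_classes
    for weight, posterior in zip(model_weights, posteriors):
        if len(posterior) != n_classes:
            raise ValueError("all models must score the same set of classes")
        for k, p in enumerate(posterior):
            fused[k] += (weight / total) * p
    return fused


def predict_emotion(fused: Sequence[float]) -> str:
    """Return the emotion label with the highest fused probability."""
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best]


# Made-up posteriors for one image: the visual model favors "awe",
# the textual model favors "fear".
visual_posterior = [0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.10, 0.05]
textual_posterior = [0.05, 0.10, 0.05, 0.05, 0.05, 0.05, 0.60, 0.05]

# Hypothetical weights: the textual model is trusted three times as much.
fused = bma_late_fusion([visual_posterior, textual_posterior], [1.0, 3.0])
print(predict_emotion(fused))
```

With these illustrative weights the fused distribution follows the more trusted textual model. A uniform weighting would instead reduce the scheme to plain averaging of the unimodal predictions.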

Notes

We used WEKA (http://www.cs.waikato.ac.nz/ml/weka) to train all the baseline models, while BMA was implemented from scratch.

Acknowledgements

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.

Author information

Authors and Affiliations

Department of Informatics, Systems and Communication, University of Milano Bicocca, Viale Sarca 336, 20126, Milano, Italy
Silvia Corchs, Elisabetta Fersini & Francesca Gasparini

Corresponding author

Correspondence to Silvia Corchs.

About this article

Cite this article

Corchs, S., Fersini, E. & Gasparini, F. Ensemble learning on visual and textual data for social image emotion classification. Int. J. Mach. Learn. & Cyber. 10, 2057–2070 (2019). https://doi.org/10.1007/s13042-017-0734-0

Received: 01 April 2017
Accepted: 04 October 2017
Published: 07 October 2017
Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s13042-017-0734-0
