
Knowledge-Based Systems

Volume 191, 5 March 2020, 105245

Object semantics sentiment correlation analysis enhanced image sentiment classification

https://doi.org/10.1016/j.knosys.2019.105245

Abstract

With the development of artificial intelligence and deep learning, image sentiment analysis has become a hotspot in computer vision and attracted increasing attention. Most existing methods identify emotions by building complex models or robust features from the whole image, which neglects the influence of object semantics on image sentiment analysis. In this paper, we propose a novel object semantics sentiment correlation model (OSSCM), based on a Bayesian network, to guide image sentiment classification. OSSCM is constructed by exploring the relationships between image emotions and the combinations of object semantics in images, so it fully accounts for the effect of object semantics on image emotions. A convolutional neural network (CNN)-based visual sentiment analysis model is then proposed to analyze image sentiment from the visual aspect. Finally, three fusion strategies are proposed to realize OSSCM-enhanced image sentiment classification. Experiments on the public emotion datasets FI and Flickr_LDL demonstrate that the proposed method achieves good performance on image emotion analysis and outperforms state-of-the-art methods.

Introduction

With the development of artificial intelligence and big data, computer vision for image and video analysis has gradually become a hotspot in computer science and attracted increasing attention. Image sentiment classification is an important branch of image understanding that studies the emotional response of humans to images by combining knowledge from visual cognition theory, psychology and pattern recognition. Approaches for recognizing the emotions evoked by images have a wide range of potential applications, such as comment assistance, aesthetic quality categorization, opinion mining and affective image retrieval.

Image sentiment classification, which involves a much higher level of abstraction and subjectivity in the human recognition process, is inherently more challenging [1]. Identifying the human emotions aroused by images requires a rich set of cues to achieve accurate predictions. In the literature, several sentiment analysis techniques have been used to predict image emotions, including low-level visual feature-based methods [2], [3], [4], semantic-level feature-based models [5], [6], [7] and deep learning architecture-based methods [8], [9], [10]. Several studies demonstrate that deep features outperform hand-crafted features in endowing computers with the capability of perceiving emotions in the same manner as humans. In addition, the influence of local regions on image emotions has also been studied [11], [12]; the results illustrate that the emotions evoked by an image arise not only from its global appearance but also from the interplay among its local regions.

Upon careful study, we find that the semantics of the object combinations corresponding to local regions in an image are closely related to image sentiment. Some examples are shown in Fig. 1. In Fig. 1(a), a “girl” is playing with a “puppy” under the emotion label “amusement”, which is consistent with our expected emotional response. In Fig. 1(b), a “girl” and a “tiger” appear at the same time, which makes people feel afraid: we cannot help feeling trepidation and worry about the safety of the “girl” while looking at this picture, which matches the sentiment label “fear”. There is no doubt that the combination of “girl” and “puppy” stimulates a different sentiment from the combination of “girl” and “tiger”. Hence, the correlation between emotions and the object semantics within images can be applied to emotion analysis.

We herein propose a novel image sentiment classification method that makes full use of object semantics sentiment correlation to guide visual sentiment analysis. A novel object semantics sentiment correlation model (OSSCM) based on a Bayesian network is proposed to generate a self-learning knowledge base that serves as guidance information for enhancing the visual sentiment classification results. Then, a convolutional neural network (CNN)-based visual sentiment analysis model is proposed to predict emotions from the visual aspect; it encodes the entire image into a fixed-dimensional representation that is fed into a fully connected layer for emotion label prediction. Finally, the guidance information from the well-constructed OSSCM is used to enhance the visual sentiment analysis results for more accurate emotion prediction.
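To make the overall flow concrete, the following is a minimal sketch of how these three components could be wired together. The callables `detector`, `osscm`, `visual_cnn` and `fuse` are hypothetical stand-ins for the modules described above, not the paper's actual implementation.

```python
import numpy as np

def classify_image_sentiment(image, detector, osscm, visual_cnn, fuse):
    """End-to-end sketch: object detection -> OSSCM guidance -> visual CNN -> fusion.

    `detector`, `osscm`, `visual_cnn` and `fuse` are hypothetical callables that
    stand in for the components described in the paper.
    """
    objects = detector(image)             # e.g. {"girl", "puppy"} from an object detector
    p_guidance = osscm(objects)           # P(emotion | detected objects), shape (C,)
    p_visual = visual_cnn(image)          # softmax output of the visual model, shape (C,)
    p_fused = fuse(p_guidance, p_visual)  # one of the fusion strategies
    return int(np.argmax(p_fused))        # index of the predicted emotion class
```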

The original contributions of this paper are summarized as follows.

  • A novel OSSCM is proposed to analyze the correlations between image emotions and object semantics. OSSCM can quantify the effects of different combinations of object semantics on different sentiment labels.

  • An object semantics sentiment correlation analysis enhanced image sentiment classification method is proposed, which uses the guidance information from OSSCM to guide the emotion probability distributions from the softmax layer of the CNN for the final sentiment prediction.

  • Three fusion strategies are proposed to realize the OSSCM-enhanced visual sentiment classification, and our proposed method achieves good performance and outperforms state-of-the-art methods on public affective datasets.

The remainder of this paper is organized as follows. Section 2 reviews the related work. Section 3 describes the framework of our proposed image sentiment classification method. Subsequently, the proposed OSSCM is described in Section 4, followed by the CNN-based visual sentiment analysis model in Section 5. Three fusion strategies for OSSCM-enhanced image sentiment classification are described in Section 6. Experimental results and analysis are presented in Section 7, followed by the conclusions and future work in Section 8.

Section snippets

Related work

In this section, we mainly review methods for image sentiment classification from three aspects: image feature representation, combining the entire image with local regions, and multi-model sentiment analysis.

Most previous methods for image sentiment classification often utilize low-level features, such as color, texture and shape, or different groups of them to analyze image emotions based on psychology and the principles of art [2], [3], [4], [13], [14], [15], [16]. Using hand-crafted low-level

The framework of image sentiment classification

Our proposed image sentiment classification method fully takes into account the visual information of the whole image as well as the object semantic information in the image and its relationship with sentiment. The framework of the method is illustrated in Fig. 2. It primarily includes three parts: the object semantics sentiment correlation model (OSSCM), the CNN-based visual sentiment analysis model and the fusion strategies for OSSCM-enhanced visual sentiment classification.

OSSCM is constructed by using

Object semantics sentiment correlation model

In this paper, we propose an object semantics sentiment correlation model (OSSCM) to analyze the correlation between image emotions and combinations of object semantics. Next, we introduce OSSCM in two parts: OSSCM construction and OSSCM inference.
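The snippet above does not show the network structure or parameters of OSSCM, so the following is only an illustrative sketch under a simplifying assumption: a naive-Bayes-style factorization P(e | o_1, ..., o_k) ∝ P(e) ∏ P(o_i | e), with the conditional tables estimated from object/emotion co-occurrence counts and Laplace smoothing. The paper's actual OSSCM is a Bayesian network learned from data, which can capture richer dependencies among object semantics.

```python
from collections import Counter, defaultdict
import numpy as np

class NaiveOSSCMSketch:
    """Toy object-semantics sentiment model (NOT the paper's exact OSSCM).

    Assumes P(e | objects) is proportional to P(e) * prod_i P(obj_i | e),
    estimated from object/emotion co-occurrence counts with Laplace smoothing.
    """

    def __init__(self, emotions):
        self.emotions = list(emotions)
        self.emotion_counts = Counter()          # emotion -> number of images
        self.obj_counts = defaultdict(Counter)   # emotion -> object -> count
        self.vocab = set()

    def fit(self, samples):
        """samples: iterable of (object_set, emotion_label) pairs."""
        for objects, emotion in samples:
            self.emotion_counts[emotion] += 1
            for obj in objects:
                self.obj_counts[emotion][obj] += 1
                self.vocab.add(obj)

    def predict_proba(self, objects):
        """Return P(emotion | objects) as a normalized numpy array."""
        total = sum(self.emotion_counts.values())
        log_probs = []
        for e in self.emotions:
            # Smoothed log prior over emotions
            lp = np.log((self.emotion_counts[e] + 1) / (total + len(self.emotions)))
            denom = sum(self.obj_counts[e].values()) + len(self.vocab) + 1
            for obj in objects:
                lp += np.log((self.obj_counts[e][obj] + 1) / denom)
            log_probs.append(lp)
        log_probs = np.array(log_probs)
        p = np.exp(log_probs - log_probs.max())  # stabilized exponentiation
        return p / p.sum()
```

For instance, a model fitted on pairs such as ({"girl", "puppy"}, "amusement") and ({"girl", "tiger"}, "fear") assigns a higher "fear" probability to the object set {"girl", "tiger"}, mirroring the correlation illustrated in Fig. 1.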

CNN based visual sentiment analysis model

From the visual aspect, we exploit a CNN-based model to analyze image emotions. The framework of the CNN-based visual sentiment analysis model is shown in Fig. 5. The classic VGGNet [34] is used for feature extraction and emotion classification. We replace fc7 with a layer of 256 neurons and change the fc8 classification layer to the C-way output required by the affective datasets. To reduce the training time, we use the weights of the pre-trained VGGNet [34] to initialize the parameters of
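As a concrete illustration of this backbone modification, the following PyTorch sketch assumes the torchvision VGG-16 layout (where classifier index 3 corresponds to fc7 and index 6 to fc8); the exact backbone variant and training details used in the paper may differ.

```python
import torch.nn as nn
from torchvision import models

def build_visual_model(num_classes):
    """VGG with fc7 replaced by a 256-neuron layer and a C-way fc8 (sketch only)."""
    # Initialize from ImageNet pre-trained weights, in the spirit of the paper's setup.
    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    # torchvision classifier layout: [fc6, ReLU, Dropout, fc7, ReLU, Dropout, fc8]
    vgg.classifier[3] = nn.Linear(4096, 256)          # fc7 -> 256 neurons (newly initialized)
    vgg.classifier[6] = nn.Linear(256, num_classes)   # fc8 -> C-way emotion classifier
    return vgg
```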

Fusion strategies for OSSCM enhanced image sentiment classification

Based on the proposed framework, the procedure of visual sentiment classification for a given test image can be summarized as follows. On the one hand, we input the test image into Faster R-CNN [47] to obtain object semantics information, which is then fed into the well-constructed OSSCM ⟨G, Θ⟩ to generate the sentiment probability distribution Y_t = (P_{T_1}, P_{T_2}, …, P_{T_C}), which is used as guidance information. On the other hand, we input the test image into the visual sentiment analysis model to get the
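The paper's three fusion strategies are not visible in this snippet, so the functions below are only generic examples of how a guidance distribution Y_t and a visual softmax distribution could be combined; they are illustrative sketches, not the paper's exact rules.

```python
import numpy as np

def fuse_weighted_sum(p_guidance, p_visual, alpha=0.5):
    """Convex combination of the two distributions (alpha is a tunable weight)."""
    p = alpha * np.asarray(p_guidance) + (1.0 - alpha) * np.asarray(p_visual)
    return p / p.sum()

def fuse_product(p_guidance, p_visual):
    """Element-wise product followed by renormalization."""
    p = np.asarray(p_guidance) * np.asarray(p_visual)
    return p / p.sum()

def fuse_max(p_guidance, p_visual):
    """Per-class maximum followed by renormalization."""
    p = np.maximum(np.asarray(p_guidance), np.asarray(p_visual))
    return p / p.sum()
```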

Experiments

Experiments are conducted on two widely used affective datasets, FI [24] and Flickr_LDL [48], to verify the performance of our proposed image sentiment classification method. The experimental datasets and results are discussed as follows.

Conclusion and future work

Emotions are states of feeling triggered by specific factors. Visual sentiment analysis studies the impact of images on human emotions; it has become a hotspot in computer vision and is widely used in image retrieval, machine recognition and robotics. In this paper, we propose a novel image sentiment classification method that applies the correlation between object semantics and image sentiment to enhance the visual sentiment analysis model for more accurate image emotion prediction. An

Acknowledgments

This research has been supported by the National Natural Science Foundation of China (Grant 61672227 and Grant 61806078).

References (54)

  • T.J. Gross, et al., An analytical threshold for combining Bayesian networks, Knowl.-Based Syst. (2019)
  • D. Joshi, et al., Aesthetics and emotions in images, IEEE Signal Process. Mag. (2011)
  • X. Lu, P. Suryanarayan, R.B. Adams, Jr., J. Li, M.G. Newman, J.Z. Wang, On shape and the computability of emotions, in:...
  • Y. Yang, J. Jia, S. Zhang, B. Wu, Q. Chen, J. Li, C. Xing, J. Tang, How do your friends on social media disclose your...
  • S. Wang, et al., Multiple emotion tagging for multimedia data by exploiting high-order dependencies among emotions, IEEE Trans. Multimedia (2015)
  • D. Borth, et al., Sentibank: large-scale ontology and classifiers for detecting sentiment and emotions in visual content
  • J. Yuan, S. Mcdonough, Q. You, J. Luo, Sentribute: image sentiment analysis from a mid-level perspective, in:...
  • S. Zhao, et al., Continuous probability distribution prediction of image emotions via multitask shared sparse regression, IEEE Trans. Multimedia (2017)
  • V. Campos, A. Salvador, X. Giró i Nieto, B. Jou, Diving deep into sentiment: Understanding fine-tuned cnns for visual...
  • Q. You, J. Luo, H. Jin, J. Yang, Robust image sentiment analysis using progressively trained and domain transferred...
  • J. Wang, J. Fu, Y. Xu, T. Mei, Beyond object recognition: Visual sentiment analysis with deep coupled adjective and...
  • J. Yang, et al., Visual sentiment prediction based on automatic discovery of affective regions, IEEE Trans. Multimedia (2018)
  • J. Machajdik, A. Hanbury, Affective image classification using features inspired by psychology and art theory, in:...
  • A. Sartori, D. Culibrk, Y. Yan, N. Sebe, Who’s afraid of itten: Using the art theory of color combination to analyze...
  • S. Zhao, Y. Gao, X. Jiang, H. Yao, T. Chua, X. Sun, Exploring principles-of-art features for image emotion recognition,...
  • V. Yanulevskaya, J.C. van Gemert, K. Roth, A. Herbold, N. Sebe, J. Geusebroek, Emotional valence categorization using...
  • D. Borth, et al., Large-scale visual sentiment ontology and detectors using adjective noun pairs

