
Knowledge-Based Systems

Volume 191, 5 March 2020, 105245

Object semantics sentiment correlation analysis enhanced image sentiment classification

https://doi.org/10.1016/j.knosys.2019.105245

Abstract

With the development of artificial intelligence and deep learning, image sentiment analysis has become a hotspot in computer vision and attracted increasing attention. Most existing methods identify emotions by building complex models or robust features from the whole image, which neglects the influence of object semantics on image sentiment analysis. In this paper, we propose a novel object semantics sentiment correlation model (OSSCM), based on a Bayesian network, to guide image sentiment classification. OSSCM is constructed by exploring the relationships between image emotions and the combinations of object semantics in images, so it fully accounts for the effect of object semantics on image emotions. A convolutional neural network (CNN)-based visual sentiment analysis model is then proposed to analyze image sentiment from the visual aspect. Finally, three fusion strategies are proposed to realize OSSCM-enhanced image sentiment classification. Experiments on the public emotion datasets FI and Flickr_LDL demonstrate that the proposed method achieves good performance on image emotion analysis and outperforms state-of-the-art methods.

Introduction

With the development of artificial intelligence and big data, computer vision for image and video analysis has gradually become a hotspot in computer science and attracted increasing attention. Image sentiment classification is an important branch of image understanding that studies the emotional response of humans to images by combining knowledge from visual cognition theory, psychology and pattern recognition. Approaches for recognizing the emotions evoked by images have a wide range of potential applications, such as comment assistance, aesthetic quality categorization, opinion mining and affective image retrieval.

Image sentiment classification, which involves a much higher level of abstraction and subjectivity in the human recognition process, is inherently more challenging [1]. Identifying the human emotions aroused by images requires a rich set of cues to achieve accurate predictions. In the literature, several sentiment analysis techniques have been used to predict image emotions, including low-level visual feature-based methods [2], [3], [4], semantic-level feature-based models [5], [6], [7] and deep learning architecture-based methods [8], [9], [10]. Several studies demonstrate that deep features outperform hand-crafted features in endowing computers with the capability of perceiving emotions in the same manner as humans. In addition, the influence of local regions on image emotions has also been studied [11], [12]; the results illustrate that the emotions evoked by an image arise not only from its global appearance but also from the interplay among its local regions.

Upon careful study, we find that the semantics of the object combinations corresponding to local regions in an image are closely related to image sentiment. Some examples are shown in Fig. 1. In Fig. 1(a), a “girl” is playing with a “puppy” under the emotion label “amusement”, which is consistent with our expected emotional response. In Fig. 1(b), a “girl” and a “tiger” appear at the same time, which makes people feel afraid: we cannot help feeling trepidation and worry about the safety of the “girl” while looking at this picture, which matches the sentiment label “fear”. There is no doubt that the combination of “girl” and “puppy” stimulates a different sentiment from the combination of “girl” and “tiger”. Hence, the correlation between emotions and the object semantics within images can be applied to emotion analysis.

We herein propose a novel image sentiment classification method that makes full use of object semantics sentiment correlation to guide visual sentiment analysis. A novel object semantics sentiment correlation model (OSSCM) based on a Bayesian network is proposed to generate a self-learning knowledge base that serves as guidance information for enhancing the visual sentiment classification results. Then, a convolutional neural network (CNN)-based visual sentiment analysis model is proposed to predict emotions from the visual aspect; it encodes the entire image into a fixed-dimensional representation that is fed into a fully connected layer for emotion label prediction. Finally, the guidance information from the well-constructed OSSCM is used to enhance the visual sentiment analysis results for more accurate emotion prediction.
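To make the overall flow concrete, the following is a minimal sketch of how these three components could be wired together. The callables `detector`, `osscm`, `visual_cnn` and `fuse` are hypothetical stand-ins for the modules described above, not the paper's actual implementation.

```python
import numpy as np

def classify_image_sentiment(image, detector, osscm, visual_cnn, fuse):
    """End-to-end sketch: object detection -> OSSCM guidance -> visual CNN -> fusion.

    `detector`, `osscm`, `visual_cnn` and `fuse` are hypothetical callables that
    stand in for the components described in the paper.
    """
    objects = detector(image)             # e.g. {"girl", "puppy"} from an object detector
    p_guidance = osscm(objects)           # P(emotion | detected objects), shape (C,)
    p_visual = visual_cnn(image)          # softmax output of the visual model, shape (C,)
    p_fused = fuse(p_guidance, p_visual)  # one of the fusion strategies
    return int(np.argmax(p_fused))        # index of the predicted emotion class
```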

The original contributions of this paper are summarized as follows.

  • A novel OSSCM is proposed to analyze the correlations between image emotions and object semantics. OSSCM can quantify the effects of different combinations of object semantics on different sentiment labels.

  • An object semantics sentiment correlation analysis enhanced image sentiment classification method is proposed, which uses the guidance information from OSSCM to guide the emotion probability distributions from the softmax layer of the CNN for the final sentiment prediction.

  • Three fusion strategies are proposed to realize the OSSCM-enhanced visual sentiment classification, and our proposed method achieves good performance and outperforms state-of-the-art methods on public affective datasets.

The remainder of this paper is organized as follows. Section 2 reviews the related work. Section 3 describes the framework of our proposed image sentiment classification method. Subsequently, the proposed OSSCM is described in Section 4, followed by the CNN-based visual sentiment analysis model in Section 5. Three fusion strategies for OSSCM-enhanced image sentiment classification are described in Section 6. Experimental results and analysis are presented in Section 7, followed by the conclusions and future work in Section 8.

Section snippets

Related work

In this section, we mainly review methods for image sentiment classification from three aspects: image feature representation, combining the entire image with local regions, and multi-model sentiment analysis.

Most previous methods for image sentiment classification often utilize low-level features, such as color, texture and shape, or different groups of them to analyze image emotions based on psychology and the principles of art [2], [3], [4], [13], [14], [15], [16]. Using hand-crafted low-level

The framework of image sentiment classification

Our proposed image sentiment classification method fully takes into account the visual information of the whole image as well as the object semantic information in the image and its relationship with sentiment. The framework of the method is illustrated in Fig. 2. It primarily includes three parts: the object semantics sentiment correlation model (OSSCM), the CNN-based visual sentiment analysis model and the fusion strategies for OSSCM-enhanced visual sentiment classification.

OSSCM is constructed by using

Object semantics sentiment correlation model

In this paper, we propose an object semantics sentiment correlation model (OSSCM) to analyze the correlation between image emotions and combinations of object semantics. Next, we introduce OSSCM in two parts: OSSCM construction and OSSCM inference.
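The snippet above does not show the network structure or parameters of OSSCM, so the following is only an illustrative sketch under a simplifying assumption: a naive-Bayes-style factorization P(e | o_1, ..., o_k) ∝ P(e) ∏ P(o_i | e), with the conditional tables estimated from object/emotion co-occurrence counts and Laplace smoothing. The paper's actual OSSCM is a Bayesian network learned from data, which can capture richer dependencies among object semantics.

```python
from collections import Counter, defaultdict
import numpy as np

class NaiveOSSCMSketch:
    """Toy object-semantics sentiment model (NOT the paper's exact OSSCM).

    Assumes P(e | objects) is proportional to P(e) * prod_i P(obj_i | e),
    estimated from object/emotion co-occurrence counts with Laplace smoothing.
    """

    def __init__(self, emotions):
        self.emotions = list(emotions)
        self.emotion_counts = Counter()          # emotion -> number of images
        self.obj_counts = defaultdict(Counter)   # emotion -> object -> count
        self.vocab = set()

    def fit(self, samples):
        """samples: iterable of (object_set, emotion_label) pairs."""
        for objects, emotion in samples:
            self.emotion_counts[emotion] += 1
            for obj in objects:
                self.obj_counts[emotion][obj] += 1
                self.vocab.add(obj)

    def predict_proba(self, objects):
        """Return P(emotion | objects) as a normalized numpy array."""
        total = sum(self.emotion_counts.values())
        log_probs = []
        for e in self.emotions:
            # Smoothed log prior over emotions
            lp = np.log((self.emotion_counts[e] + 1) / (total + len(self.emotions)))
            denom = sum(self.obj_counts[e].values()) + len(self.vocab) + 1
            for obj in objects:
                lp += np.log((self.obj_counts[e][obj] + 1) / denom)
            log_probs.append(lp)
        log_probs = np.array(log_probs)
        p = np.exp(log_probs - log_probs.max())  # stabilized exponentiation
        return p / p.sum()
```

For instance, a model fitted on pairs such as ({"girl", "puppy"}, "amusement") and ({"girl", "tiger"}, "fear") assigns a higher "fear" probability to the object set {"girl", "tiger"}, mirroring the correlation illustrated in Fig. 1.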

CNN based visual sentiment analysis model

From the visual aspect, we exploit a CNN-based model to analyze image emotions. The framework of the CNN-based visual sentiment analysis model is shown in Fig. 5. The classic VGGNet [34] is used for feature extraction and emotion classification. We replace fc7 with a layer of 256 neurons and change the fc8 classification layer to the C-way output required by the affective datasets. To reduce the training time, we use the weights of the pre-trained VGGNet [34] to initialize the parameters of
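As a concrete illustration of this backbone modification, the following PyTorch sketch assumes the torchvision VGG-16 layout (where classifier index 3 corresponds to fc7 and index 6 to fc8); the exact backbone variant and training details used in the paper may differ.

```python
import torch.nn as nn
from torchvision import models

def build_visual_model(num_classes):
    """VGG with fc7 replaced by a 256-neuron layer and a C-way fc8 (sketch only)."""
    # Initialize from ImageNet pre-trained weights, in the spirit of the paper's setup.
    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    # torchvision classifier layout: [fc6, ReLU, Dropout, fc7, ReLU, Dropout, fc8]
    vgg.classifier[3] = nn.Linear(4096, 256)          # fc7 -> 256 neurons (newly initialized)
    vgg.classifier[6] = nn.Linear(256, num_classes)   # fc8 -> C-way emotion classifier
    return vgg
```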

Fusion strategies for OSSCM enhanced image sentiment classification

Based on the proposed framework, the procedure of visual sentiment classification for a given test image can be summarized as follows. On the one hand, we input the test image into Faster R-CNN [47] to obtain object semantics information, which is then fed into the well-constructed OSSCM ⟨G, Θ⟩ to generate the sentiment probability distribution Y_t = (P_{T_1}, P_{T_2}, …, P_{T_C}), which is used as guidance information. On the other hand, we input the test image into the visual sentiment analysis model to get the
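The paper's three fusion strategies are not visible in this snippet, so the functions below are only generic examples of how a guidance distribution Y_t and a visual softmax distribution could be combined; they are illustrative sketches, not the paper's exact rules.

```python
import numpy as np

def fuse_weighted_sum(p_guidance, p_visual, alpha=0.5):
    """Convex combination of the two distributions (alpha is a tunable weight)."""
    p = alpha * np.asarray(p_guidance) + (1.0 - alpha) * np.asarray(p_visual)
    return p / p.sum()

def fuse_product(p_guidance, p_visual):
    """Element-wise product followed by renormalization."""
    p = np.asarray(p_guidance) * np.asarray(p_visual)
    return p / p.sum()

def fuse_max(p_guidance, p_visual):
    """Per-class maximum followed by renormalization."""
    p = np.maximum(np.asarray(p_guidance), np.asarray(p_visual))
    return p / p.sum()
```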

Experiments

Experiments are conducted on two widely used affective datasets, FI [24] and Flickr_LDL [48], to verify the performance of our proposed image sentiment classification method. The experimental datasets and results are discussed as follows.

Conclusion and future work

Emotions are states of feeling triggered by specific factors. Visual sentiment analysis studies the impact of images on human emotions; it has become a hotspot in computer vision and is widely used in image retrieval, machine recognition and robotics. In this paper, we propose a novel image sentiment classification method that applies the correlation between object semantics and image sentiment to enhance the visual sentiment analysis model for more accurate image emotion prediction. An

Acknowledgments

This research has been supported by the National Natural Science Foundation of China (Grant 61672227 and Grant 61806078).

References (54)

  • T.J. Gross, et al., An analytical threshold for combining Bayesian networks, Knowl.-Based Syst. (2019)
  • D. Joshi, et al., Aesthetics and emotions in images, IEEE Signal Process. Mag. (2011)
  • X. Lu, P. Suryanarayan, R.B. Adams, Jr., J. Li, M.G. Newman, J.Z. Wang, On shape and the computability of emotions, in:...
  • Y. Yang, J. Jia, S. Zhang, B. Wu, Q. Chen, J. Li, C. Xing, J. Tang, How do your friends on social media disclose your...
  • S. Wang, et al., Multiple emotion tagging for multimedia data by exploiting high-order dependencies among emotions, IEEE Trans. Multimedia (2015)
  • D. Borth, et al., Sentibank: large-scale ontology and classifiers for detecting sentiment and emotions in visual content
  • J. Yuan, S. Mcdonough, Q. You, J. Luo, Sentribute: image sentiment analysis from a mid-level perspective, in:...
  • S. Zhao, et al., Continuous probability distribution prediction of image emotions via multitask shared sparse regression, IEEE Trans. Multimedia (2017)
  • V. Campos, A. Salvador, X. Giró i Nieto, B. Jou, Diving deep into sentiment: Understanding fine-tuned cnns for visual...
  • Q. You, J. Luo, H. Jin, J. Yang, Robust image sentiment analysis using progressively trained and domain transferred...
  • J. Wang, J. Fu, Y. Xu, T. Mei, Beyond object recognition: Visual sentiment analysis with deep coupled adjective and...
  • J. Yang, et al., Visual sentiment prediction based on automatic discovery of affective regions, IEEE Trans. Multimedia (2018)
  • J. Machajdik, A. Hanbury, Affective image classification using features inspired by psychology and art theory, in:...
  • A. Sartori, D. Culibrk, Y. Yan, N. Sebe, Who’s afraid of itten: Using the art theory of color combination to analyze...
  • S. Zhao, Y. Gao, X. Jiang, H. Yao, T. Chua, X. Sun, Exploring principles-of-art features for image emotion recognition,...
  • V. Yanulevskaya, J.C. van Gemert, K. Roth, A. Herbold, N. Sebe, J. Geusebroek, Emotional valence categorization using...
  • D. Borth, et al., Large-scale visual sentiment ontology and detectors using adjective noun pairs

