Polarity-aware attention network for image sentiment analysis

  • Regular Paper
  • Multimedia Systems

Abstract

Image sentiment analysis aims to use a computational model to automatically discover the emotions implied by an image, a capability that is crucial in many practical applications. Psychological findings indicate that emotional content is usually concentrated in a few informative regions, and that these regions convey different emotional polarities and intensities. The emotion conveyed by the whole image can therefore be regarded as the combined effect of its positive-polarity and negative-polarity emotional regions. Motivated by this psychological prior, we propose a new polarity-aware attention network for image sentiment analysis that is trained end to end. Specifically, the proposed network is composed of a sentiment feature extraction backbone, a polarity-aware attention module, and a fused classification module. The backbone extracts global contextual features. The polarity-aware attention module not only attends to positive and negative emotion regions by predicting polarity-aware attention maps, but also estimates their polarity intensities. The fused classification module integrates the outputs of the first two modules for the final image sentiment prediction. A compound loss function guides network learning in a weakly supervised manner, combined with a distributed label smoothing learning method. We validate our method on multiple benchmarks, and the experimental results demonstrate that it achieves superior performance over several state-of-the-art methods.
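
The data flow described in the abstract (global features, two polarity-specific attention maps, per-polarity intensities, and a fused descriptor) can be illustrated with a small sketch. This is not the authors' implementation: the dot-product attention scoring, the sigmoid intensity estimate, the concatenation-based fusion, and all names (`attend`, `polarity_aware_fusion`, `q_pos`, `q_neg`, `w_int`) are assumptions chosen only to make the idea concrete, and the toy numbers are made up.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(features, query):
    """Score each region against a query vector, normalise the scores
    into an attention map, and pool the regions with that map."""
    attn_map = softmax([dot(f, query) for f in features])
    channels = len(features[0])
    pooled = [sum(w * f[c] for w, f in zip(attn_map, features))
              for c in range(channels)]
    return attn_map, pooled

def polarity_aware_fusion(features, q_pos, q_neg, w_int):
    """Hypothetical fusion of global, positive and negative descriptors.

    features: per-region feature vectors, standing in for backbone output.
    q_pos, q_neg: assumed learnable queries, one per polarity.
    w_int: assumed learnable weights for the scalar intensity estimate.
    """
    n = len(features)
    channels = len(features[0])
    # Global context: plain average over all regions.
    global_feat = [sum(f[c] for f in features) / n for c in range(channels)]
    pos_map, pos_feat = attend(features, q_pos)
    neg_map, neg_feat = attend(features, q_neg)
    # Intensity of each polarity, squashed into (0, 1).
    pos_int = sigmoid(dot(pos_feat, w_int))
    neg_int = sigmoid(dot(neg_feat, w_int))
    # Fused descriptor: global features plus intensity-weighted polarity features.
    fused = (global_feat
             + [pos_int * x for x in pos_feat]
             + [neg_int * x for x in neg_feat])
    return {"pos_map": pos_map, "neg_map": neg_map,
            "pos_intensity": pos_int, "neg_intensity": neg_int,
            "fused": fused}

# Toy input: 4 regions with 3 channels each (illustrative values only).
regions = [[1.0, 0.0, 0.5], [0.2, 0.9, 0.1], [0.0, 0.3, 0.8], [0.6, 0.6, 0.0]]
out = polarity_aware_fusion(regions, q_pos=[1.0, 0.0, 0.0],
                            q_neg=[0.0, 1.0, 0.0], w_int=[0.5, -0.5, 0.1])
```

In this sketch the fused vector would be passed to a classifier, and the two attention maps could be inspected to see which regions each polarity branch emphasises. How the actual network parameterises and supervises these components is described in the full article, not here.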

Acknowledgements

This work is partially supported by the National Natural Science Foundation of China under Grants 62276139, U2001211, and 61802199, and by the Basic Research Program of Jiangsu Province (Frontier Leading Technology) under Grant BK20192004.

Author information

Corresponding author

Correspondence to Yubao Sun.

Additional information

Communicated by J. Gao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yan, Q., Sun, Y., Fan, S. et al. Polarity-aware attention network for image sentiment analysis. Multimedia Systems 29, 389–399 (2023). https://doi.org/10.1007/s00530-022-00935-5

  • DOI: https://doi.org/10.1007/s00530-022-00935-5
