Domain Adaptation Based Technique for Image Emotion Recognition Using Image Captions

  • Conference paper
Computer Vision and Image Processing (CVIP 2020)

Abstract

Images are powerful tools for affective content analysis, and image emotion recognition is useful in graphics, gaming, animation, entertainment, and cinematography. This paper proposes a technique for recognizing emotions in images containing facial, non-facial, and non-human components. Emotion-labeled images are first mapped to their corresponding textual captions. The captions are then used to re-train a text emotion recognition model as a domain-adaptation step, and the adapted model classifies the captions into discrete emotion classes. Since each caption has a one-to-one mapping with its image, the emotion label predicted for a caption is taken as the label of the image. The suitability of the image captions for emotion classification is evaluated using standard caption-evaluation metrics. The proposed approach addresses the scarcity of emotion-labeled image datasets and pre-trained models, and demonstrates an accuracy of 59.17% for image emotion recognition.
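The caption-evaluation step mentioned above relies on metrics such as BLEU. As an illustration only (not the authors' implementation), here is a minimal sketch of unigram BLEU (BLEU-1) with a brevity penalty, scored against a single reference caption; the `bleu1` function name and the single-reference simplification are assumptions of this sketch.

```python
import math
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    """Simplified unigram BLEU (BLEU-1) against a single reference caption."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    # Clipped precision: a candidate word is credited at most as many
    # times as it occurs in the reference.
    clipped = sum(min(n, ref_counts[w]) for w, n in Counter(cand).items())
    precision = clipped / len(cand)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision
```

For example, `bleu1("a dog runs in the park", "a dog runs in the park")` returns `1.0`, while a shorter or partially matching caption scores lower. Full BLEU additionally combines higher-order n-gram precisions and multiple references.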

Notes

  1. https://www.tensorflow.org/.

Acknowledgements

This research was supported by the Ministry of Human Resource Development (MHRD), India, under grant number 1-3146198040.

Author information

Correspondence to Puneet Kumar.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kumar, P., Raman, B. (2021). Domain Adaptation Based Technique for Image Emotion Recognition Using Image Captions. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1377. Springer, Singapore. https://doi.org/10.1007/978-981-16-1092-9_33

  • DOI: https://doi.org/10.1007/978-981-16-1092-9_33

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1091-2

  • Online ISBN: 978-981-16-1092-9

  • eBook Packages: Computer Science (R0)
