Abstract
Images are powerful carriers of affective content, and image emotion recognition has applications in graphics, gaming, animation, entertainment, and cinematography. This paper proposes a technique for recognizing emotions in images containing facial, non-facial, and non-human components. Emotion-labeled images are first mapped to their corresponding textual captions. These captions are then used to re-train a text emotion recognition model as a domain-adaptation step, and the adapted model classifies the captions into discrete emotion classes. Because each caption has a one-to-one mapping with its image, the emotion label predicted for a caption is taken as the emotion label of the image. The suitability of the captions for emotion classification is assessed using standard caption-evaluation metrics. The proposed approach offers a way around the scarcity of emotion-labeled image datasets and pre-trained models, and achieves an accuracy of 59.17% for image emotion recognition.
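The pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration only: the captioning model and the domain-adapted text emotion classifier are replaced here with toy keyword rules (all names, keywords, and example captions are hypothetical), whereas the paper uses a trained image-captioning network and a re-trained text emotion recognition model.

```python
# Toy keyword sets standing in for a trained text emotion classifier.
EMOTION_KEYWORDS = {
    "happy": {"smiling", "party", "sunny", "playing"},
    "sad": {"crying", "rain", "alone", "grave"},
    "angry": {"fighting", "shouting", "storm"},
}

def caption_image(image_id, captions):
    """Stand-in for an image-captioning model: look up a stored caption."""
    return captions[image_id]

def classify_caption(caption):
    """Stand-in for the adapted text emotion model: keyword voting."""
    words = set(caption.lower().split())
    scores = {emo: len(words & kws) for emo, kws in EMOTION_KEYWORDS.items()}
    return max(scores, key=scores.get)

def label_images(captions):
    """Captions map one-to-one to images, so a caption's predicted
    emotion is taken as the emotion label of its image."""
    return {img: classify_caption(caption_image(img, captions))
            for img in captions}

captions = {
    "img1": "children smiling and playing in a sunny park",
    "img2": "a woman crying alone in the rain",
}
print(label_images(captions))  # {'img1': 'happy', 'img2': 'sad'}
```

The key design point carried over from the paper is the indirection: the image is never classified directly; its caption is, and the label is transferred back through the one-to-one image-caption mapping.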
Acknowledgements
This research was supported by the Ministry of Human Resource Development (MHRD) INDIA with reference grant number: 1-3146198040.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
Cite this paper
Kumar, P., Raman, B. (2021). Domain Adaptation Based Technique for Image Emotion Recognition Using Image Captions. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1377. Springer, Singapore. https://doi.org/10.1007/978-981-16-1092-9_33
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1091-2
Online ISBN: 978-981-16-1092-9
eBook Packages: Computer Science (R0)