
Recognizing Image Semantic Information Through Multi-Feature Fusion and SSAE-Based Deep Network

  • Image & Signal Processing
  • Published in: Journal of Medical Systems

Abstract

Images are powerful tools for conveying human emotions, with different images evoking diverse emotional responses. Many factors influence the emotion an image evokes, and much prior research has focused on low-level features such as color and texture. Inspired by the success of deep convolutional neural networks (CNNs) in visual recognition, we apply a data augmentation method to small data sets to obtain a sufficiently large training set. In this paper, low-level image features (color and texture) assist the extraction of high-level features (image object-category features and deep emotional features), which are learned automatically by deep networks, yielding more effective image sentiment features. We then use a stacked sparse autoencoder (SSAE) network to recognize the emotions evoked by an image. Finally, high-level semantic descriptive phrases covering both image emotions and objects are output. Our experiments are carried out on the IAPS and GAPED data sets in the dimensional emotion space and on the ArtPhoto data set in the discrete space. Compared with traditional hand-crafted feature extraction methods and other existing models, our method achieves superior test performance.
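To illustrate the kind of building block the abstract describes, the sketch below implements one sparse autoencoder layer trained with a KL-divergence sparsity penalty; layers like this can be stacked (SSAE) by feeding each layer's hidden codes to the next. This is a minimal illustration, not the authors' actual architecture: the feature dimensions, sparsity target `rho`, penalty weight `beta`, and learning rate are all assumed values, and the input stands in for concatenated low-level (color/texture) and CNN-derived features.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SparseAutoencoder:
    """One layer of a stacked sparse autoencoder (illustrative sketch)."""

    def __init__(self, n_in, n_hidden, rho=0.05, beta=0.1, lr=0.5):
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden))   # encoder weights
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_in))   # decoder weights
        self.b2 = np.zeros(n_in)
        self.rho, self.beta, self.lr = rho, beta, lr

    def encode(self, X):
        return sigmoid(X @ self.W1 + self.b1)

    def step(self, X):
        # Forward pass: hidden codes and reconstruction.
        H = self.encode(X)
        R = sigmoid(H @ self.W2 + self.b2)
        rho_hat = H.mean(axis=0)                 # mean activation per hidden unit
        # Loss = mean squared reconstruction error + KL sparsity penalty,
        # which pushes average activations toward the target rho.
        kl = np.sum(self.rho * np.log(self.rho / rho_hat)
                    + (1 - self.rho) * np.log((1 - self.rho) / (1 - rho_hat)))
        loss = 0.5 * np.mean(np.sum((R - X) ** 2, axis=1)) + self.beta * kl
        # Backward pass (plain batch gradient descent).
        n = X.shape[0]
        dR = (R - X) * R * (1 - R) / n
        dW2 = H.T @ dR
        db2 = dR.sum(axis=0)
        sparse_grad = self.beta * (-self.rho / rho_hat
                                   + (1 - self.rho) / (1 - rho_hat)) / n
        dH = (dR @ self.W2.T + sparse_grad) * H * (1 - H)
        dW1 = X.T @ dH
        db1 = dH.sum(axis=0)
        for p, g in ((self.W1, dW1), (self.b1, db1),
                     (self.W2, dW2), (self.b2, db2)):
            p -= self.lr * g
        return loss

# Toy "fused feature" vectors standing in for concatenated colour/texture
# and CNN features (the 20-d size is made up for illustration).
X = rng.random((64, 20))
sae = SparseAutoencoder(n_in=20, n_hidden=8)
losses = [sae.step(X) for _ in range(200)]
codes = sae.encode(X)  # hidden codes a second stacked layer would consume
```

Greedy layer-wise training proceeds by fitting one such layer, freezing it, and training the next layer on `codes`; a classifier head on top of the final codes would then perform the emotion recognition.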


Figs. 1–17



Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (No. 61976150, No. 61873178, and No. 61876124), the Natural Science Foundation of Shanxi Province (No. 201801D121135), the Key Research and Development (R&D) Projects of Shanxi Province (No. 201803D31038), and the Key Research and Development (R&D) Projects of Jinzhong City (No. Y192006).

Author information


Corresponding author

Correspondence to Haifang Li.

Ethics declarations

Conflict of Interest

The authors have no conflict of interest in submitting this paper to the Journal of Medical Systems.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Image & Signal Processing


About this article


Cite this article

Yang, X., Wang, Z., Deng, H. et al. Recognizing Image Semantic Information Through Multi-Feature Fusion and SSAE-Based Deep Network. J Med Syst 44, 46 (2020). https://doi.org/10.1007/s10916-019-1498-8

