A multimodal feature learning approach for sentiment analysis of social network multimedia

Multimedia Tools and Applications

Abstract

In this paper we investigate a multimodal feature learning approach, based on neural network models such as Skip-gram and Denoising Autoencoders, for sentiment analysis of micro-blogging content, such as Twitter short messages, which consist of a short text and, possibly, an image. The approach is motivated by recent advances in: i) training neural network language models, which have proved extremely efficient on web-scale text corpora and perform very well at capturing syntactic and semantic word similarities; ii) unsupervised learning, with neural networks, of robust visual features that can be recovered from partial observations caused by occlusions or by noisy and heavily modified images. We propose a novel architecture that incorporates these neural networks, test it on several standard Twitter datasets, and show that the approach is efficient and obtains good classification results.
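To make the two building blocks named in the abstract more concrete, the following is a minimal Python sketch, not the authors' architecture. It shows how (center, context) training pairs for a Skip-gram model can be generated from a tokenized message, and how a single-layer denoising autoencoder can be trained to reconstruct clean inputs from corrupted ones. The function names, the window size, the masking noise, the tied weights, the squared-error loss, and all hyperparameters are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs; Skip-gram learns to predict context from center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def mask_noise(x, drop_prob, rng):
    """Corrupt x by zeroing each entry independently with probability drop_prob."""
    keep = rng.random(x.shape) >= drop_prob
    return x * keep


def train_dae(x, n_hidden=8, drop_prob=0.3, lr=0.5, epochs=300, seed=0):
    """Train a tied-weight denoising autoencoder with full-batch gradient descent.

    Hypothetical helper: the encoder sees a corrupted copy of x, while the
    loss compares the reconstruction against the clean x.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W = rng.normal(0.0, 0.1, size=(d, n_hidden))  # shared by encoder and decoder
    b = np.zeros(n_hidden)                        # encoder bias
    c = np.zeros(d)                               # decoder bias
    losses = []
    for _ in range(epochs):
        x_tilde = mask_noise(x, drop_prob, rng)   # corrupted input
        h = sigmoid(x_tilde @ W + b)              # hidden code
        r = sigmoid(h @ W.T + c)                  # reconstruction
        losses.append(0.5 * np.sum((r - x) ** 2) / n)
        # Backpropagate the squared error through both sigmoid layers.
        d_r = (r - x) * r * (1.0 - r)
        d_h = (d_r @ W) * h * (1.0 - h)
        # W appears in both encoder and decoder, so its gradient has two terms.
        grad_W = (x_tilde.T @ d_h + d_r.T @ h) / n
        W -= lr * grad_W
        b -= lr * d_h.sum(axis=0) / n
        c -= lr * d_r.sum(axis=0) / n
    return W, b, c, losses
```

In this sketch the learned hidden code `sigmoid(x @ W + b)` would play the role of the robust visual feature, while the Skip-gram pairs would feed a word-embedding model for the text; how the two representations are combined for sentiment classification is not described in the abstract and is left out here.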

Author information

Corresponding author

Correspondence to Tiberio Uricchio.

About this article

Cite this article

Baecchi, C., Uricchio, T., Bertini, M. et al. A multimodal feature learning approach for sentiment analysis of social network multimedia. Multimed Tools Appl 75, 2507–2525 (2016). https://doi.org/10.1007/s11042-015-2646-x

  • DOI: https://doi.org/10.1007/s11042-015-2646-x
