MGraph: multimodal event summarization in social media using topic models and graph-based ranking

  • Regular Paper
International Journal of Multimedia Information Retrieval

Abstract

Due to the increasing popularity of social media platforms, the number of messages (posts) related to public events, especially posts sharing multimedia content, is steadily increasing. Sharing images can contribute to rich, live coverage of an event. Yet, despite the value and interestingness of some posts, there is a lot of spam and redundancy, which makes it challenging to select the most important and characteristic posts for the event. In this work, we describe MGraph, a summarization framework that, given a set of social media posts about an event, selects a subset of the shared images that simultaneously maximizes relevance and minimizes visual redundancy. MGraph employs a topic modelling technique based on different modalities to capture the relevance of posts to event topics, and a graph-based ranking algorithm to produce a diverse ranking of the selected high-relevance images. A user-centred evaluation on a dataset comprising a variety of real-world events demonstrates that MGraph considerably outperforms a number of state-of-the-art summarization algorithms in terms of relevance and diversity (25% and 7% improvement, respectively).
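
The trade-off between relevance and visual redundancy can be illustrated with a minimal sketch. It is not the MGraph pipeline, which the abstract describes as topic modelling followed by graph-based ranking. Instead it uses a simpler greedy selection in the style of maximal marginal relevance (MMR), and it assumes that per-image relevance scores and visual feature vectors are already available. The names `select_images`, `relevance`, `features` and `trade_off` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_images(relevance, features, k, trade_off=0.7):
    """Greedy MMR-style selection (illustrative stand-in, not MGraph itself).

    relevance: dict mapping image id -> relevance score (assumed given)
    features:  dict mapping image id -> visual feature vector (assumed given)
    k:         number of images to select
    trade_off: weight on relevance; (1 - trade_off) penalises redundancy
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def score(img):
            # Redundancy is the highest similarity to any image already chosen.
            redundancy = max(
                (cosine(features[img], features[s]) for s in selected),
                default=0.0,
            )
            return trade_off * relevance[img] - (1 - trade_off) * redundancy

        # Sort the candidates first so that ties are broken deterministically.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: "a" and "b" are visual near-duplicates, "c" is visually distinct.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
features = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
summary = select_images(relevance, features, k=2)  # ["a", "c"]
```

With `trade_off=1.0` the redundancy penalty vanishes and the selection reduces to a plain relevance ranking, so the near-duplicate "b" is chosen instead of "c".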

Notes

  1. In microblogging platforms, such a set is typically formed by considering all posts that are tagged with an event-specific hashtag. In practice, despite being tagged with the event hashtag, many of these posts are irrelevant to the event, as in the case of promotional or trolling posts.

  2. The CPM algorithm discovers subgraphs with clique-like structure, often referred to as communities, but here they are referred to as cliques to distinguish them from the communities produced by the SCAN algorithm (Sect. 3.6).

  3. http://www.crowdflower.com/.

  4. http://www.ark.cs.cmu.edu/TweetNLP.

  5. https://github.com/MKLab-ITI/multimedia-indexing.

  6. https://github.com/MKLab-ITI/mgraph-summarization.

References

  1. Aiello LM, Petkos G, Martín CJ, Corney D, Papadopoulos S, Skraba R, Göker A, Kompatsiaris I, Jaimes A (2013) Sensing trending topics in twitter. IEEE Trans Multimed 15(6):1268–1282

  2. Alonso O, Shiells K (2013) Timelines as summaries of popular scheduled events. In: Proceedings of the 22nd International Conference on World Wide Web (WWW) companion, pp 1037–1044

  3. Bian J, Yang Y, Chua TS (2013) Multimedia summarization for trending topics in microblogs. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, CIKM ’13. ACM, New York, NY, USA, pp 1807–1812

  4. Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022

  5. Carbonell J, Goldstein J (1998) The use of mmr, diversity-based reranking for reordering documents and producing summaries. In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp 335–336. ACM

  6. Celebrating #SB48 on Twitter. https://blog.twitter.com/2014/celebrating-sb48-on-twitter (2014). Accessed 27 Feb 2014

  7. Chakrabarti D, Punera K (2011) Event summarization using tweets. In: Proceedings of 6th AAAI International Conference on Weblogs and Social Media (ICWSM)

  8. Charikar MS (2002) Similarity estimation techniques from rounding algorithms. In: Proceedings of the 34th Annual ACM Symposium on Theory of Computing, STOC ’02. ACM, New York, NY, USA, pp 380–388

  9. Chua FCT, Asur S (2013) Automatic summarization of events from social media. In: Proceedings of 8th AAAI International Conference on Weblogs and Social Media (ICWSM)

  10. Clarke CL, Kolla M, Cormack GV, Vechtomova O, Ashkan A, Büttcher S, MacKinnon I (2008) Novelty and diversity in information retrieval evaluation. In: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pp 659–666. ACM

  11. Corney D, Goker A, Martin C, Papadopoulos S, Mantziou E, Spyromitros-Xioufis E, Schinas M, Iliakopoulou K, Mironidis T, Tsampoulatidis Y, Kompatsiaris Y, Mass Y, Aiello LM (2013) D4.3: Social media indexing, aggregation and retrieval. Tech Rep SocialSensor. http://socialsensor.eu/images/D4.3.pdf

  12. Dork M, Gruen D, Williamson C, Carpendale S (2010) A visual backchannel for large-scale events. IEEE Trans Vis Comput Graph 16(6):1129–1138

  13. Erkan G, Radev DR (2004) Lexrank: graph-based lexical centrality as salience in text summarization. J Artif Intell Res 22(1):457–479

  14. Fergus R, Weiss Y, Torralba A (2009) Semi-supervised learning in gigantic image collections. Adv Neural Inf Process Syst 22:522–530

  15. Guille A, Favre C (2014) Mention-anomaly-based event detection and tracking in twitter. In: 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp 375–382. IEEE

  16. Jegou H, Douze M, Schmid C (2011) Product quantization for nearest neighbor search. IEEE Trans Pattern Anal Mach Intell 33(1):117–128

  17. Jegou H, Perronnin F, Douze M, Sánchez J, Perez P, Schmid C (2012) Aggregating local image descriptors into compact codes. IEEE Trans Pattern Anal Mach Intell 34(9):1704–1716

  18. Lin C, Lin C, Li J, Wang D, Chen Y, Li T (2012) Generating event storylines from microblogs. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM ’12. ACM, New York, NY, USA, pp 175–184

  19. Lu Y, Zhang P, Cao Y, Hu Y, Guo L (2014) On the frequency distribution of retweets. Procedia Comput Sci 31:747–753

  20. Mantziou E, Papadopoulos S, Kompatsiaris Y (2015) Learning to detect concepts with approximate laplacian eigenmaps in large-scale and online settings. IJMIR 4(2):95–111

  21. Marcus A, Bernstein MS, Badar O, Karger DR, Madden S, Miller RC (2011) TwitInfo: aggregating and visualizing microblogs for event exploration. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’11. ACM, New York, NY, USA, pp 227–236

  22. McMinn AJ, Moshfeghi Y, Jose JM (2013) Building a large-scale corpus for evaluating event detection on twitter. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge management, pp 409–418. ACM

  23. McParlane PJ, McMinn AJ, Jose JM (2014) Picture the scene: visually summarising social media events. In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pp 1459–1468. ACM

  24. Mei Q, Guo J, Radev D (2010) Divrank: the interplay of prestige and diversity in information networks. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’10. ACM, New York, NY, USA, pp 1009–1018

  25. Nichols J, Mahmud J, Drews C (2012) Summarizing sporting events using twitter. In: Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, IUI ’12. ACM, New York, NY, USA, pp 189–198

  26. Page L, Brin S, Motwani R, Winograd T (1999) The pagerank citation ranking: bringing order to the web

  27. Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814–818

  28. Radev DR, Jing H, Styś M, Tam D (2004) Centroid-based summarization of multiple documents. Inf Process Manag 40(6):919–938

  29. Schinas M, Papadopoulos S, Kompatsiaris Y, Mitkas PA (2015) Visual event summarization on social media using topic modelling and graph-based ranking algorithms. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, pp 203–210. ACM

  30. Shen C, Liu F, Weng F, Li T (2013) A participant-based approach for event summarization using twitter streams. In: Proceedings of NAACL-HLT, pp 1152–1162

  31. Spyromitros-Xioufis E, Papadopoulos S, Kompatsiaris IY, Tsoumakas G, Vlahavas I (2014) A comprehensive study over VLAD and product quantization in large-scale image retrieval. IEEE Trans Multimed 16(6):1713–1728

  32. Wang D, Li T, Ogihara M (2012) Generating pictorial storylines via minimum-weight connected dominating set approximation in multi-view graphs. In: AAAI. Citeseer

  33. Xu X, Yuruk N, Feng Z, Schweiger TAJ (2007) Scan: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’07. ACM, New York, NY, USA, pp 824–833

  34. Yajuan D, Zhimin C, Furu W, Ming Z, Shum HY (2012) Twitter topic summarization by ranking tweets using social influence and content quality. In: Proceedings of the 24th International Conference on Computational Linguistics, pp 763–780

  35. Yang Z, Guo J, Cai K, Tang J, Li J, Zhang L, Su Z (2010) Understanding retweeting behaviors in social networks. In: Proceedings of the 19th ACM international conference on Information and knowledge management, pp 1633–1636. ACM

  36. Zhao WX, Jiang J, Weng J, He J, Lim EP, Yan H, Li X (2011) Comparing twitter and traditional media using topic models. In: Advances in Information Retrieval, pp 338–349. Springer, New York

Author information

Corresponding author

Correspondence to Manos Schinas.

Additional information

This work was supported by the REVEAL project, partially funded by the European Commission, contract FP7-610928.

About this article

Cite this article

Schinas, M., Papadopoulos, S., Kompatsiaris, Y. et al. MGraph: multimodal event summarization in social media using topic models and graph-based ranking. Int J Multimed Info Retr 5, 51–69 (2016). https://doi.org/10.1007/s13735-015-0089-9

  • DOI: https://doi.org/10.1007/s13735-015-0089-9