Abstract
Due to the increasing popularity of social media platforms, the number of messages (posts) related to public events, especially posts sharing multimedia content, is steadily increasing. Sharing images can contribute to a rich and live coverage of the event. Yet, despite the value and interest of some posts, there is a lot of spam and redundancy, which makes it challenging to select the most important and characteristic posts for the event. In this work, we describe MGraph, a summarization framework that, given a set of social media posts about an event, selects a subset of shared images, simultaneously maximizing their relevance and minimizing their visual redundancy. MGraph employs a topic modelling technique based on different modalities to capture the relevance of posts to event topics, and a graph-based ranking algorithm to produce a diverse ranking of the selected high-relevance images. A user-centred evaluation on a dataset comprising a variety of real-world events demonstrates that MGraph considerably outperforms a number of state-of-the-art summarization algorithms in terms of relevance and diversity (improvements of 25% and 7%, respectively).
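The trade-off between relevance and visual redundancy described above can be illustrated with a minimal sketch. This is not MGraph's graph-based ranking algorithm; it is a simple greedy selection in the spirit of maximal marginal relevance (Carbonell and Goldstein 1998), shown only to make the objective concrete. The function names, the `trade_off` parameter, the per-image relevance scores and the feature vectors are all hypothetical and assumed to be computed elsewhere.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_diverse(relevance, features, k, trade_off=0.7):
    """Greedy MMR-style selection (illustrative only, not MGraph itself).

    relevance: hypothetical relevance score per image, higher is better.
    features:  hypothetical visual feature vector per image.
    Returns the indices of up to k images in selection order.
    """
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any image already chosen.
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1 - trade_off) * redundancy

        # Iterate in index order so that ties are broken deterministically.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Images 0 and 1 are near-duplicates; image 2 is less relevant but distinct.
scores = [0.9, 0.85, 0.6]
vectors = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0]]
print(select_diverse(scores, vectors, k=2))
```

With `trade_off=1.0` the selection reduces to a plain relevance ranking, so the near-duplicate image would be chosen second; lower values penalise similarity to already selected images more strongly.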
Notes
In microblogging platforms, such a set is typically formed by considering all posts that are tagged with an event-specific hashtag. In practice, despite being tagged with the event hashtag, many of these posts are irrelevant to the event, as in the case of promotional or trolling posts.
The CPM algorithm discovers subgraphs with clique-like structure, often referred to as communities, but here they are referred to as cliques to distinguish them from the communities produced by the SCAN algorithm (Sect. 3.6).
References
Aiello LM, Petkos G, Martín CJ, Corney D, Papadopoulos S, Skraba R, Göker A, Kompatsiaris I, Jaimes A (2013) Sensing trending topics in twitter. IEEE Trans Multimed 15(6):1268–1282
Alonso O, Shiells K (2013) Timelines as summaries of popular scheduled events. In: Proceedings of the 22nd International Conference on World Wide Web (WWW) companion, pp 1037–1044
Bian J, Yang Y, Chua TS (2013) Multimedia summarization for trending topics in microblogs. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, CIKM ’13. ACM, New York, NY, USA, pp 1807–1812
Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022
Carbonell J, Goldstein J (1998) The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp 335–336. ACM
Celebrating #SB48 on Twitter. https://blog.twitter.com/2014/celebrating-sb48-on-twitter (2014). Accessed 27 Feb 2014
Chakrabarti D, Punera K (2011) Event summarization using tweets. In: Proceedings of 6th AAAI International Conference on Weblogs and Social Media (ICWSM)
Charikar MS (2002) Similarity estimation techniques from rounding algorithms. In: Proceedings of the 34th Annual ACM Symposium on Theory of Computing, STOC ’02. ACM, New York, NY, USA, pp 380–388
Chua FCT, Asur S (2013) Automatic summarization of events from social media. In: Proceedings of 8th AAAI International Conference on Weblogs and Social Media (ICWSM)
Clarke CL, Kolla M, Cormack GV, Vechtomova O, Ashkan A, Büttcher S, MacKinnon I (2008) Novelty and diversity in information retrieval evaluation. In: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pp 659–666. ACM
Corney D, Goker A, Martin C, Papadopoulos S, Mantziou E, Spyromitros-Xioufis E, Schinas M, Iliakopoulou K, Mironidis T, Tsampoulatidis Y, Kompatsiaris Y, Mass Y, Aiello LM (2013) D4.3: Social media indexing, aggregation and retrieval. Tech Rep SocialSensor. http://socialsensor.eu/images/D4.3.pdf
Dork M, Gruen D, Williamson C, Carpendale S (2010) A visual backchannel for large-scale events. IEEE Trans Vis Comput Graph 16(6):1129–1138
Erkan G, Radev DR (2004) Lexrank: graph-based lexical centrality as salience in text summarization. J Artif Intell Res 22(1):457–479
Fergus R, Weiss Y, Torralba A (2009) Semi-supervised learning in gigantic image collections. Adv Neural Inf Process Syst 22:522–530
Guille A, Favre C (2014) Mention-anomaly-based event detection and tracking in twitter. In: 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp 375–382. IEEE
Jegou H, Douze M, Schmid C (2011) Product quantization for nearest neighbor search. IEEE Trans Pattern Anal Mach Intell 33(1):117–128
Jegou H, Perronnin F, Douze M, Sánchez J, Perez P, Schmid C (2012) Aggregating local image descriptors into compact codes. IEEE Trans Pattern Anal Mach Intell 34(9):1704–1716
Lin C, Lin C, Li J, Wang D, Chen Y, Li T (2012) Generating event storylines from microblogs. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM ’12. ACM, New York, NY, USA, pp 175–184
Lu Y, Zhang P, Cao Y, Hu Y, Guo L (2014) On the frequency distribution of retweets. Procedia Comput Sci 31:747–753
Mantziou E, Papadopoulos S, Kompatsiaris Y (2015) Learning to detect concepts with approximate laplacian eigenmaps in large-scale and online settings. IJMIR 4(2):95–111
Marcus A, Bernstein MS, Badar O, Karger DR, Madden S, Miller RC (2011) TwitInfo: aggregating and visualizing microblogs for event exploration. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’11. ACM, New York, NY, USA, pp 227–236
McMinn AJ, Moshfeghi Y, Jose JM (2013) Building a large-scale corpus for evaluating event detection on twitter. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge management, pp 409–418. ACM
McParlane PJ, McMinn AJ, Jose JM (2014) Picture the scene: visually summarising social media events. In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pp 1459–1468. ACM
Mei Q, Guo J, Radev D (2010) Divrank: the interplay of prestige and diversity in information networks. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’10. ACM, New York, NY, USA, pp 1009–1018
Nichols J, Mahmud J, Drews C (2012) Summarizing sporting events using twitter. In: Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, IUI ’12. ACM, New York, NY, USA, pp 189–198
Page L, Brin S, Motwani R, Winograd T (1999) The pagerank citation ranking: bringing order to the web
Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814–818
Radev DR, Jing H, Styś M, Tam D (2004) Centroid-based summarization of multiple documents. Inf Process Manag 40(6):919–938
Schinas M, Papadopoulos S, Kompatsiaris Y, Mitkas PA (2015) Visual event summarization on social media using topic modelling and graph-based ranking algorithms. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, pp 203–210. ACM
Shen C, Liu F, Weng F, Li T (2013) A participant-based approach for event summarization using twitter streams. In: Proceedings of NAACL-HLT, pp 1152–1162
Spyromitros-Xioufis E, Papadopoulos S, Kompatsiaris IY, Tsoumakas G, Vlahavas I (2014) A comprehensive study over VLAD and product quantization in large-scale image retrieval. IEEE Trans Multimed 16(6):1713–1728
Wang D, Li T, Ogihara M (2012) Generating pictorial storylines via minimum-weight connected dominating set approximation in multi-view graphs. In: AAAI. Citeseer
Xu X, Yuruk N, Feng Z, Schweiger TAJ (2007) Scan: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’07. ACM, New York, NY, USA, pp 824–833
Yajuan D, Zhimin C, Furu W, Ming Z, Shum HY (2012) Twitter topic summarization by ranking tweets using social influence and content quality. In: Proceedings of the 24th International Conference on Computational Linguistics, pp 763–780
Yang Z, Guo J, Cai K, Tang J, Li J, Zhang L, Su Z (2010) Understanding retweeting behaviors in social networks. In: Proceedings of the 19th ACM international conference on Information and knowledge management, pp 1633–1636. ACM
Zhao WX, Jiang J, Weng J, He J, Lim EP, Yan H, Li X (2011) Comparing twitter and traditional media using topic models. In: Advances in Information Retrieval, pp 338–349. Springer, New York
Additional information
This work was supported by the REVEAL project, partially funded by the European Commission, contract FP7-610928.
About this article
Cite this article
Schinas, M., Papadopoulos, S., Kompatsiaris, Y. et al. MGraph: multimodal event summarization in social media using topic models and graph-based ranking. Int J Multimed Info Retr 5, 51–69 (2016). https://doi.org/10.1007/s13735-015-0089-9
DOI: https://doi.org/10.1007/s13735-015-0089-9