Abstract
Given a textual data stream related to an event, social event summarization aims to generate an informative textual description that captures all the important moments; it plays a critical role in mining and analyzing social media streams. In this paper, we present a general social event summarization framework using Twitter streams. The proposed framework consists of three key components: participant detection, sub-event detection, and summary tweet extraction. To make the system applicable to real data, an online clustering approach is developed for participant detection and an online temporal-content mixture model is proposed for sub-event detection. Experiments show that the proposed framework achieves performance similar to that of its batch counterpart.







Notes
We use “participant sub-events” and “global sub-events” to denote the important moments that occur at the participant level and at the level of the entire event, respectively. A “global sub-event” may consist of one or more “participant sub-events”. For example, a “steal” in a basketball game typically involves both a defensive and an offensive player, and the corresponding global sub-event can be generated by merging the two participant-level sub-events.
We use the algorithm described in [27] as a baseline, ad hoc spike detection algorithm.
β was set to 5 minutes in our experiments.
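To illustrate what an ad hoc spike detector over tweet volume can look like, here is a minimal sketch. It is not the algorithm of [27] and not the method evaluated in the paper: the function name `detect_spikes`, the per-bin tweet counts, and the parameters `tau`, `alpha`, and `warmup` are all assumptions chosen for illustration. The idea is to track a running mean and running mean deviation of the tweet count per time bin and to flag any bin whose count exceeds the mean by more than `tau` deviations.

```python
from typing import List


def detect_spikes(counts: List[int], tau: float = 2.0,
                  alpha: float = 0.125, warmup: int = 5) -> List[int]:
    """Return indices of time bins whose tweet count is a spike.

    All parameters are illustrative assumptions:
      tau    - how many running mean deviations above the mean counts as a spike
      alpha  - smoothing factor of the exponentially weighted updates
      warmup - number of leading bins used only to initialise the statistics
    """
    if len(counts) <= warmup:
        return []

    # Initialise the running statistics from the warm-up bins.
    head = counts[:warmup]
    mean = sum(head) / warmup
    dev = sum(abs(c - mean) for c in head) / warmup

    spikes = []
    for i in range(warmup, len(counts)):
        c = counts[i]
        # Flag the bin before updating, so a spike does not mask itself.
        if dev > 0 and (c - mean) / dev > tau:
            spikes.append(i)
        # Exponentially weighted update of mean deviation, then mean.
        dev = (1 - alpha) * dev + alpha * abs(c - mean)
        mean = (1 - alpha) * mean + alpha * c
    return spikes


if __name__ == "__main__":
    # Hypothetical tweets-per-minute counts with one burst at index 7.
    volume = [10, 12, 11, 9, 10, 11, 10, 60, 12, 11]
    print(detect_spikes(volume))
```

A detector of this kind looks only at volume, which is why it serves as a baseline: it cannot tell which participant a burst is about, nor separate two sub-events that overlap in time.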
References
Allan, J.: Topic Detection and Tracking: Event-based Information Organization. Kluwer Academic Publishers Norwell, MA, USA (2002)
Ahlqvist, T., Beck, A., Halonen, M., Heinonen, S.: Social Media Roadmaps: Exploring the futures triggered by social media. VTT Tiedotteita - Research Notes (2454) (2008)
Atefeh, F., Khreich, W.: A survey of techniques for event detection in twitter. Comput. Intell., 132–164 (2013)
Bagga, A., Baldwin, B.: Algorithms for Scoring Coreference Chains. In: The First International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference (1998)
Becker, H., Naaman, M., Gravano, L.: Beyond Trending Topics: Real-World Event Identification on Twitter. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, pp. 438–441 (2011)
Chakrabarti, D., Punera, K.: Event Summarization Using Tweets. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, pp. 66–73 (2011)
Chang, Y., Wang, X., Mei, Q., Liu, Y.: Towards Twitter Context Summarization with User Influence Models. In: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, pp. 527–536 (2013)
Diao, Q., Jiang, J., Zhu, F., Lim, E.: Finding Bursty Topics from Microblogs. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp. 536–544 (2012)
Daumé III, H., Marcu, D.: Bayesian Query-Focused Summarization. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, pp. 305–312 (2006)
Erkan, G., Radev, D.R.: Lexrank: Graph-based lexical centrality as salience in text summarization. JAIR 22(1), 457–479 (2004)
Guille, A., Favre, C.: Event detection, tracking, and visualization in Twitter: a mention-anomaly-based approach. Soc. Netw. Anal. Min., 18:1–18:18 (2015)
Goswami, A., Kumar, A.: A survey of event detection techniques in online social networks. Soc. Netw. Anal. Min., 107 (2016)
Guha, S., Khuller, S.: Approximation algorithms for connected dominating sets. Algorithmica 20(4), 374–387 (1998)
Goldstein, J., Mittal, V., Carbonell, J., Kantrowitz, M.: Multi-Document Summarization by Sentence Extraction. In: NAACL-ANLP 2000 Workshop on Automatic Summarization, Association for Computational Linguistics, pp. 40–48 (2000)
Hofmann, T.: Probabilistic Latent Semantic Indexing. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 50–57 (1999)
Hong, L., Dom, B., Gurumurthy, S., Tsioutsiouliklis, K.: A Time-dependent Topic Model for Multiple Text Streams. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 832–840 (2011)
He, R., Liu, Y., Yu, G., Tang, J., Hu, Q., Dang, J.: Twitter summarization with social-temporal context. World Wide Web, 1–24 (2017)
Haghighi, A., Vanderwende, L.: Exploring Content Models for Multi-Document Summarization. In: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, pp. 362–370 (2009)
Imran, M., Castillo, C., Diaz, F., Vieweg, S.: Processing Social Media Messages in Mass Emergency: A Survey. ACM Comput. Surv., 67:1–67:38 (2015)
Inouye, D., Kalita, J.K.: Comparing Twitter Summarization Algorithms for Multiple Post Summaries. In: Proceedings of 2011 IEEE third International Conference on Social Computing, pp. 290–306 (2011)
Jurafsky, D., Martin, J.: Speech and language processing. Prentice Hall, New York (2008)
Lee, L.: Measures of Distributional Similarity. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pp. 25–32 (1999)
Lin, C.-Y.: ROUGE: a Package for Automatic Evaluation of Summaries. In: Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, pp. 74–81 (2004)
Li, Z., Tang, J., Wang, X., Liu, J., Lu, H.: Multimedia News Summarization in Search. ACM Trans. Intell. Syst. Technol., 33:1–33:20 (2016)
Li, T., Xie, N., Zeng, C., Zhou, W., Zheng, L., Jiang, Y., Yang, Y., Ha, H.-Y., Xue, W., Huang, Y., Chen, S.-C., Navlakha, J., Iyengar, S.S.: Data-Driven Techniques in Disaster Information Management. ACM Comput. Surv. 50(1), 1:1–1:45 (2017)
Mani, I.: Automatic summarization. Comput. Linguist. 28(2)
Marcus, A., Bernstein, M., Badar, O., Karger, D., Madden, S., Miller, R.: Twitinfo: Aggregating and Visualizing Microblogs for Event Exploration. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 227–236 (2011)
Nichols, J., Mahmud, J., Drews, C.: Summarizing Sporting Events Using Twitter. In: Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, pp. 189–198 (2012)
Nenkova, A., Vanderwende, L.: The impact of frequency on summarization. Microsoft Research, Redmond, Washington, Tech. Rep. MSR-TR-2005-101
Purushotham, S., Kuo, C.-C.J.: Personalized Group Recommender Systems for Location- and Event-Based Social Networks. ACM Trans. Web, 16:1–16:29 (2016)
Peng, B., Li, J., Chen, J., Han, X., Xu, R., Wong, K.F.: Trending Sentiment-Topic detection on twitter. Springer International Publishing, 66–77 (2015)
Petrovic, S., Osborne, M., Lavrenko, V.: Streaming First Story Detection with Application to Twitter. In: Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 181–189 (2010)
Ritter, A., Clark, S., Mausam, Etzioni, O.: Named Entity Recognition in Tweets: An Experimental Study. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1524–1534 (2011)
Radev, D., Jing, H., Styś, M., Tam, D.: Centroid-based summarization of multiple documents. Inf. Process. Manag. 40(6), 919–938 (2004)
Ritter, A., Mausam, Etzioni, O., Clark, S.: Open Domain Event Extraction from Twitter. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1104–1112 (2012)
Shen, C., Li, T.: Multi-Document Summarization via the Minimum Dominating Set. In: Proceedings of the 23rd International Conference on Computational Linguistics, Association for Computational Linguistics, pp. 984–992 (2010)
Shen, C., Liu, F., Weng, F., Li, T.: A Participant-based Approach for Event Summarization Using Twitter Streams. In: Proceedings of NAACL-HLT, pp. 1152–1162 (2013)
Tang, J., Yao, L., Chen, D.: Multi-Topic Based Query-oriented Summarization. In: Proceedings of SDM, pp. 1147–1158 (2009)
Takamura, H., Yokono, H., Okumura, M.: Summarizing a Document Stream. In: Proceedings of the 33rd European Conference on Advances in Information Retrieval, pp. 177–188 (2011)
Unankard, S., Li, X., Sharaf, M.A.: Emerging event detection in social networks with location sensitivity. World Wide Web, 1393–1417 (2015)
Wan, X.: Topic Analysis for Topic-Focused Multi-Document Summarization. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 1609–1612. ACM (2009)
Weng, J., Lee, B.-S.: Event Detection in Twitter. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, pp. 401–408 (2011)
Wang, D., Li, T., Zhu, S., Ding, C.: Multi-Document Summarization via Sentence-Level Semantic Analysis and Symmetric Matrix Factorization. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 307–314. ACM (2008)
Wan, X., Yang, J., Xiao, J.: Towards an Iterative Reinforcement Approach for Simultaneous Document Summarization and Keyword Extraction. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pp. 543–552 (2007)
Xue, W., Li, T., Rishe, N.: Aspect identification and ratings inference for hotel reviews. World Wide Web 20(1), 23–37 (2017)
Yamamoto, Y., Shinozaki, T., Ikegami, Y., Tsuruta, S.: Context respectful counseling agent virtualized on the Web. World Wide Web, 1–24 (2015)
Zhang, X., Li, Z., Zhu, S., Liang, W.: Detecting Spam and Promoting Campaigns in Twitter. ACM Trans. Web, 4:1–4:28 (2016)
Zubiaga, A., Spina, D., Amigó, E., Gonzalo, J.: Towards Real-time Summarization of Scheduled Events from Twitter Streams. In: Proceedings of the 23rd ACM Conference on Hypertext and Social Media, pp. 319–320 (2012)
Zhao, S., Zhong, L., Wickramasuriya, J., Vasudevan, V., LiKamWa, R., Rahmati, A.: SportSense: Real-Time Detection of NFL Game Events from Twitter. Technical Report TR0511-2012
Zhao, S., Zhong, L., Wickramasuriya, J., Vasudevan, V.: Human as Real-Time Sensors of Social and Physical Events: A Case Study of Twitter and Sports Games. Technical Report TR0620-2011, Rice University and Motorola Labs
Acknowledgements
The work was supported in part by the National Science Foundation under Grant Nos. IIS-1213026, CNS-1126619, and CNS-1461926, Chinese National Natural Science Foundation under grant 91646116, Ministry of Education/China Mobile joint research grant under Project No.5-10, and Scientific and Technological Support Project (Society) of Jiangsu Province (No. BE2016776).
Cite this article
Huang, Y., Shen, C. & Li, T. Event summarization for sports games using twitter streams. World Wide Web 21, 609–627 (2018). https://doi.org/10.1007/s11280-017-0477-6
DOI: https://doi.org/10.1007/s11280-017-0477-6