Abstract
There has been phenomenal growth in microblogging-based social communication services and subscriptions in recent years. Through these services, users publish a large number of posts within a short period of time, making it extremely hard for readers to keep track of a trending topic. One solution to this issue is text summarisation, which can generate a short summary of a trending topic from multiple posts. Most existing summarisation algorithms were proposed for long documents and do not work well for short microblogging posts. The PR (Phrase Reinforcement) algorithm was designed specifically to summarise microblogs; however, it can only generate a single-post summary that conveys a single topic, potentially overlooking other important information in the posts. In this paper, we contribute the PRICE (Phrase Reinforcement: Iteration, Clustering and Extraction) algorithm, which extends the original PR algorithm with the ability to generate both multi-post and single-post summaries that span multiple subtopics. Experimental evaluation results show that the PRICE algorithm outperforms the original PR algorithm in terms of both the ROUGE-1 and Content metrics.
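For readers unfamiliar with ROUGE-1, the metric measures unigram overlap between a generated summary and a reference summary. The following is a minimal sketch only: the function name `rouge_1`, the lowercase whitespace tokenisation, and the example strings are assumptions made for illustration, and the paper's actual evaluation setup is not described in this abstract.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Illustrative ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercase, whitespace-separated words. Real
    evaluations usually apply more careful tokenisation and may use
    several reference summaries.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token, so a word
    # repeated in the candidate is not credited more often than it
    # occurs in the reference.
    overlap = sum((cand_counts & ref_counts).values())

    ref_total = sum(ref_counts.values())
    cand_total = sum(cand_counts.values())
    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical summaries, not taken from the paper's data.
    scores = rouge_1("the cat sat", "the cat sat on the mat")
    print(scores)
```

In this example the candidate shares three of the reference's six unigrams, so recall is 0.5 while precision is 1.0. A summary that covers more subtopics of the reference would be expected to raise recall.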
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Alghamdi, M., Shen, H. (2017). Automatic Clustering and Summarisation of Microblogs: A Multi-subtopic Phrase Reinforcement Algorithm. In: Wagner, M., Li, X., Hendtlass, T. (eds) Artificial Life and Computational Intelligence. ACALCI 2017. Lecture Notes in Computer Science, vol 10142. Springer, Cham. https://doi.org/10.1007/978-3-319-51691-2_8
DOI: https://doi.org/10.1007/978-3-319-51691-2_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51690-5
Online ISBN: 978-3-319-51691-2
eBook Packages: Computer Science, Computer Science (R0)