Summarization of Twitter Events with Deep Neural Network Pre-trained Models

Chakma, Kunal; Das, Amitava; Debbarma, Swapan

doi:10.1007/978-3-030-76228-5_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1410)

Included in the following conference series:

Annual International Conference on Information Management and Big Data

Abstract

Due to the proliferation of online social media services such as Twitter, there is an upsurge in the volume of user-generated textual content. Such voluminous content is difficult for users to consume. Therefore, the development of technological solutions that automatically summarise voluminous texts is essential. This paper reports on an approach for automatically generating abstractive summaries from a collection of texts from Twitter. Our proposed approach is a two-stage framework comprising 1) event detection by clustering and 2) summarization of the events. We first generated contextualized vector representations of the tweets and then applied different clustering techniques to the vectors. We evaluated the generated clusters and, based on the evaluation, chose the clustering best suited to the summarization task. For the summarization task, we used the pre-trained models of two recently developed state-of-the-art deep neural network architectures and evaluated them on the event clusters. Standard ROUGE measures were used to evaluate the summaries. In our experiments, the best scores obtained were 46% for ROUGE-1, 30% for ROUGE-2, 41% for ROUGE-L and 23% for ROUGE-SU.
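To make the evaluation measure concrete, the following is a minimal sketch of how ROUGE-N overlap between a generated summary and a reference summary can be computed. It is an illustration only and not the authors' evaluation code: the function names `ngrams` and `rouge_n`, the lowercasing, and the whitespace tokenization are assumptions made for this example, and the abstract does not state whether the reported percentages are recall, precision or F1 values.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall and F1 for two strings.

    Assumption: lowercased whitespace tokenization; real toolkits
    usually apply more careful tokenization and optional stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat is on the mat"
    print(rouge_n(generated, gold, n=1))  # unigram overlap (ROUGE-1)
    print(rouge_n(generated, gold, n=2))  # bigram overlap (ROUGE-2)
```

In this toy pair, five of the six unigrams and three of the five bigrams are shared, so ROUGE-2 comes out lower than ROUGE-1, which matches the ordering of the scores reported in the abstract.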

Notes

1. http://www.ansi.org.
2. https://www.internetlivestats.com/.
3. https://en.wikipedia.org/wiki/2016_Indian_banknote_demonetisation.
4. https://en.wikipedia.org/wiki/2016_United_States_presidential_election.
5. https://en.wikipedia.org/wiki/Me_Too_movement_(India).
6. https://bit.ly/33lKpTj.
7. https://bit.ly/3dVJrQ4.
8. https://github.com/Twitter4J/Twitter4J.
9. We experimented with different values of min_cluster_size, but a value of 100 gave the best clustering.
10. shorturl.at/aeOTW.
11. shorturl.at/crtI4.

Author information

Authors and Affiliations

National Institute of Technology Agartala, Agartala, 799046, Tripura, India
Kunal Chakma & Swapan Debbarma
Wipro AI Lab, Bangalore, India
Amitava Das

Corresponding author

Correspondence to Kunal Chakma.

Editor information

Editors and Affiliations

Stanford University, Stanford, CA, USA
Juan Antonio Lossio-Ventura
Visibilia, São Paulo, Brazil
Jorge Carlos Valverde-Rebaza
University of Valencia, Valencia, Spain
Eduardo Díaz
Universidad del Pacífico, Lima, Peru
Hugo Alatrista-Salas

About this paper

Cite this paper

Chakma, K., Das, A., Debbarma, S. (2021). Summarization of Twitter Events with Deep Neural Network Pre-trained Models. In: Lossio-Ventura, J.A., Valverde-Rebaza, J.C., Díaz, E., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2020. Communications in Computer and Information Science, vol 1410. Springer, Cham. https://doi.org/10.1007/978-3-030-76228-5_4

DOI: https://doi.org/10.1007/978-3-030-76228-5_4
Published: 12 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76227-8
Online ISBN: 978-3-030-76228-5
eBook Packages: Computer Science (R0)
