Event detection from real-time twitter streaming data using community detection algorithm

Multimedia Tools and Applications

Abstract

The growing popularity of social media services has drawn more and more people to Twitter. Millions of tweets, many of them noisy, propagate across the Internet every day. Twitter acts as a source of information about events and breaking news. However, manually extracting useful information about important events from the endless stream of tweets is very challenging. It is therefore desirable to automate the whole event detection process, so that important events can be identified in real time from a stream of tweets as early as possible after they occur. Most existing approaches focus on “What happened”. Defining an event also requires answers to “When” and “Where”, and location and time play a very important role in handling emergency events. This article proposes a faster location-based event detection approach that does not compromise accuracy and that automatically extracts separate clusters for local or global events from real-time streaming data. The proposed approach consists of four major steps. In the first step, a new dynamic weighting scheme based on TF-IDF, named Conditional Term Frequency-Average Inverse Window Frequency (CTF-AIWF), is proposed to capture emerging keywords from the temporal dynamics of the data. Next, a new clustering algorithm named Edge Significance based Louvain Algorithm (ESBLA) is proposed to group keywords that belong to the same event. This clustering improves run-time performance by up to 50% while keeping quality (F1-score) comparable to the baseline models. In the third step, a new content-based location detection technique is proposed to detect the location of the event. This technique handles common problems in microblogging data, such as informal text, abbreviations, and misspelled keywords. Finally, Google Maps is used to visualize the events at the locations where they are happening, which speeds up decisions about the detected events. For the experiments, tweets are collected in real time and stored in a MongoDB NoSQL database for processing.
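The abstract does not give the CTF-AIWF formula, so the sketch below is not the authors' scheme. It only illustrates the general idea the abstract builds on: a TF-IDF-style weight computed over time windows, where a term scores highly if it is frequent in the current window and rare in earlier ones. The function name `window_term_weights`, the smoothed inverse window frequency, and the toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def window_term_weights(current_window, past_windows):
    """Score terms in the current window by TF x inverse window frequency.

    Illustrative only: a plain windowed TF-IDF analogue, not CTF-AIWF.
    current_window: list of tokenised tweets (each a list of str)
    past_windows:   list of earlier windows in the same shape
    """
    # Term frequency: share of all tokens in the current window.
    counts = Counter(tok for tweet in current_window for tok in tweet)
    total = sum(counts.values())

    # Vocabulary of each past window, used for window-frequency lookups.
    past_vocabs = [{tok for tweet in w for tok in tweet} for w in past_windows]
    n_windows = len(past_vocabs) + 1  # past windows plus the current one

    weights = {}
    for term, count in counts.items():
        tf = count / total
        # Number of windows containing the term; the current one always does.
        wf = 1 + sum(term in vocab for vocab in past_vocabs)
        # Smoothed inverse window frequency (assumed form, like smoothed IDF).
        iwf = math.log((1 + n_windows) / (1 + wf)) + 1
        weights[term] = tf * iwf
    return weights


# Hypothetical toy data: two quiet past windows, then a burst about one event.
past = [
    [["good", "morning"], ["traffic", "jam"]],
    [["good", "night"], ["traffic", "update"]],
]
current = [
    ["earthquake", "delhi"],
    ["earthquake", "tremor", "delhi"],
    ["good", "morning"],
]
weights = window_term_weights(current, past)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

In this toy run the bursty terms ("earthquake", "delhi") rank above the terms that also appear in earlier windows ("good", "morning"), which is the behaviour an emerging-keyword weighting is meant to produce.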

Data availability

Data will be available on reasonable request.

Notes

  1. https://www.dsayce.com/social-media/tweets-day/ (25 May 2023)

  2. https://www.dsayce.com/social-media/tweets-day/ (15 June 2023)

  3. https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/ (15 June 2023)

  4. https://storyful.com/

  5. https://www.dataminr.com/

  6. https://github.com/hiiamrohit/Countries-States-Cities-database

  7. https://github.com/pbugnion/gmaps

  8. http://keygraph.codeplex.com

  9. https://github.com/heerme

Author information

Corresponding author

Correspondence to Digvijay Pandey.

Ethics declarations

Conflict of interests

The authors declare that they have no known conflict of interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Singh, J., Pandey, D. & Singh, A.K. Event detection from real-time twitter streaming data using community detection algorithm. Multimed Tools Appl 83, 23437–23464 (2024). https://doi.org/10.1007/s11042-023-16263-3

  • DOI: https://doi.org/10.1007/s11042-023-16263-3
