TwitterNews+: A Framework for Real Time Event Detection from the Twitter Data Stream

  • Conference paper
Social Informatics (SocInfo 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10046)

Abstract

In recent years, substantial research effort has gone into approaches for detecting events in real time from the Twitter data stream. Most of these approaches, however, incur a high computational cost and are not evaluated on a publicly available corpus, which makes them difficult to compare properly. In this paper, we propose a scalable event detection system, TwitterNews+, to detect and track newsworthy events in real time. TwitterNews+ uses a novel approach to cluster event-related tweets at a significantly lower computational cost than existing state-of-the-art approaches. Finally, we evaluate the effectiveness of TwitterNews+ using a publicly available corpus and its associated ground truth data set of newsworthy events. The evaluation shows a significant improvement in recall and precision over the baselines we used.

Author information

Corresponding author

Correspondence to Mahmud Hasan.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Hasan, M., Orgun, M.A., Schwitter, R. (2016). TwitterNews+: A Framework for Real Time Event Detection from the Twitter Data Stream. In: Spiro, E., Ahn, YY. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol 10046. Springer, Cham. https://doi.org/10.1007/978-3-319-47880-7_14

  • DOI: https://doi.org/10.1007/978-3-319-47880-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47879-1

  • Online ISBN: 978-3-319-47880-7

  • eBook Packages: Computer Science, Computer Science (R0)
