Abstract
Detecting important events in high-volume news streams is a valuable task for many applications. The volume and rate of online news increase the need for automated event detection methods that can operate in real time. In this paper we develop a network-based approach that makes the working assumption that important news events always involve named entities (such as persons, locations and organizations) that are linked in news articles. Our approach uses natural language processing techniques to detect these entities in a stream of news articles and then creates a time-stamped series of networks in which the detected entities are linked by co-occurrence in articles and sentences. In this prototype, weighted node degree is tracked over time and change-point detection is used to locate important events. Potential events are characterized and distinguished using community detection on KeyGraphs that relate named entities and informative noun-phrases from related articles. This methodology already produces promising results and will be extended in the future to include a wider variety of complex network analysis techniques.
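The entity-network step described in the abstract can be illustrated with a minimal sketch. It assumes named entities have already been extracted and that each article has been assigned to a time window. The function names, the toy data, and the trailing-mean threshold rule are illustrative assumptions: the abstract does not specify which change-point detection method is used, so the rule here is a crude stand-in rather than the authors' procedure.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean, pstdev


def weighted_degree_series(articles, n_windows):
    """Build per-window co-occurrence networks and return weighted degrees.

    articles: iterable of (window_index, [entity, ...]) pairs (hypothetical
    input format; entity extraction is assumed to have happened upstream).
    Returns {entity: [weighted degree in window 0, 1, ...]}.
    """
    series = defaultdict(lambda: [0] * n_windows)
    for window, entities in articles:
        # Each unordered pair of distinct entities in an article adds
        # weight 1 to their edge, and therefore 1 to each endpoint's degree.
        for a, b in combinations(sorted(set(entities)), 2):
            series[a][window] += 1
            series[b][window] += 1
    return dict(series)


def flag_changes(values, history=3, threshold=3.0):
    """Flag windows whose value exceeds the trailing mean by more than
    `threshold` trailing standard deviations (floored at 1.0).

    This is a simple stand-in for change-point detection, not the
    method used in the paper.
    """
    flagged = []
    for t in range(history, len(values)):
        past = values[t - history:t]
        mu, sigma = mean(past), pstdev(past)
        if values[t] > mu + threshold * max(sigma, 1.0):
            flagged.append(t)
    return flagged


# Toy stream: entity "A" is mentioned steadily, then co-occurs heavily
# with other entities in window 4.
articles = [
    (0, ["A", "B"]),
    (1, ["A", "B"]),
    (2, ["A", "C"]),
    (3, ["A", "B"]),
    (4, ["A", "B", "C", "D"]),
    (4, ["A", "B", "D"]),
    (4, ["A", "C", "D"]),
    (4, ["A", "D"]),
]
series = weighted_degree_series(articles, n_windows=5)
events = flag_changes(series["A"])
```

In this toy stream the weighted degree of "A" jumps in the final window, which is the kind of signal the approach tracks. Sentence-level co-occurrence and the KeyGraph community detection step are omitted from the sketch.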
Acknowledgements
The authors acknowledge funding from a commercial entity, Adarga Ltd. (https://www.adarga.ai). The funder had no input or editorial influence over the manuscript.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Moutidis, I., Williams, H.T.P. (2020). Utilizing Complex Networks for Event Detection in Heterogeneous High-Volume News Streams. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 881. Springer, Cham. https://doi.org/10.1007/978-3-030-36687-2_55
DOI: https://doi.org/10.1007/978-3-030-36687-2_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36686-5
Online ISBN: 978-3-030-36687-2
eBook Packages: Intelligent Technologies and Robotics (R0)