Identifying Events from Streams of RDF-Graphs Representing News and Social Media Messages

  • Conference paper
The Semantic Web: ESWC 2021 Satellite Events (ESWC 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12739)

Abstract

Identifying news events and relating current news to past or already identified events is an open challenge for news agencies. In this paper, I propose a study to identify events from semantic RDF graph representations of real-time and big data streams of news and pre-news. The proposed solution must maintain acceptable accuracy over time and meet the requirements of incremental clustering, big data, and real-time streams. To design a solution for identifying events, I want to study which clustering approaches are best suited to this purpose, including methods for clustering RDF graphs using machine learning and “classical” algorithmic approaches. I also present three different evaluation approaches.

Supported by the News Angler project funded by the Norwegian Research Council’s IKTPLUSS programme as project 275872.

Notes

  1. https://en.wikipedia.org/wiki/Portal:Current_events

  2. https://newsapi.org

  3. https://en.wikipedia.org/wiki/Wikipedia:How_the_Current_events_page_works

Acknowledgements

Thesis supervised by Prof. Andreas L. Opdahl and co-supervised by Bjørnar Tessem.

Author information

Corresponding author

Correspondence to Marc Gallofré Ocaña.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Gallofré Ocaña, M. (2021). Identifying Events from Streams of RDF-Graphs Representing News and Social Media Messages. In: Verborgh, R., et al. The Semantic Web: ESWC 2021 Satellite Events. ESWC 2021. Lecture Notes in Computer Science, vol 12739. Springer, Cham. https://doi.org/10.1007/978-3-030-80418-3_31

  • DOI: https://doi.org/10.1007/978-3-030-80418-3_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-80417-6

  • Online ISBN: 978-3-030-80418-3

  • eBook Packages: Computer Science, Computer Science (R0)
