Towards Building Live Open Scientific Knowledge Graphs

Short paper. Published: 16 August 2022. DOI: 10.1145/3487553.3524624

Abstract

Due to the large number and heterogeneity of data sources, it is becoming increasingly difficult to follow research output and the scientific discourse. For example, a publication listed on DBLP may be discussed on Twitter, and its underlying data set may be used in a different paper published on arXiv. The scientific discourse this publication is involved in is spread across systems that are not integrated, and researchers may find it very hard to follow all discourses a publication or data set is involved in. Moreover, many of these data sources (DBLP, arXiv, or Twitter, to name a few) are often updated in real time. Because these systems are silos, there is no system that lets users query the content/data actively or, what would be even more beneficial, in a publish/subscribe fashion, i.e., a system that actively notifies researchers of work interesting to them when such work or discussions become available.
In this position paper, we introduce our concept of a live open knowledge graph that can integrate an extensible set of existing or new data sources in a streaming fashion, continuously fetching data from these heterogeneous sources and interlinking and enriching it on the fly. Users can subscribe to continuous queries over the content/data of interest to them and are notified when new content/data becomes available. We also highlight open challenges in realizing a system that enables this concept at scale.
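The publish/subscribe idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the `LiveGraph` class, the `ex:` identifiers, and the triple-pattern form of a subscription are all assumptions made for illustration, and a real system would add streaming ingestion, entity resolution, and a proper query language.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple

# A triple is (subject, predicate, object); None in a pattern acts as a wildcard.
Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(pattern: Pattern, triple: Triple) -> bool:
    """Return True if every non-wildcard position of the pattern equals the triple."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


@dataclass
class LiveGraph:
    """Hypothetical in-memory stand-in for a live knowledge graph."""
    triples: Set[Triple] = field(default_factory=set)
    subscriptions: List[Tuple[Pattern, Callable[[Triple], None]]] = field(default_factory=list)

    def subscribe(self, pattern: Pattern, callback: Callable[[Triple], None]) -> None:
        """Register a standing query; the callback fires for future matching triples."""
        self.subscriptions.append((pattern, callback))

    def ingest(self, triple: Triple) -> None:
        """Add a triple arriving from any source and notify matching subscribers."""
        if triple in self.triples:
            return  # already known, so no duplicate notification
        self.triples.add(triple)
        for pattern, callback in self.subscriptions:
            if matches(pattern, triple):
                callback(triple)


# Usage: a researcher subscribes to any work that uses a given data set.
# All identifiers below are made up for this example.
graph = LiveGraph()
notifications: List[Triple] = []
graph.subscribe((None, "ex:usesDataSet", "ex:datasetX"), notifications.append)

graph.ingest(("ex:paperA", "ex:listedOn", "ex:DBLP"))        # no match
graph.ingest(("ex:paperB", "ex:usesDataSet", "ex:datasetX"))  # match, notifies
graph.ingest(("ex:paperB", "ex:usesDataSet", "ex:datasetX"))  # duplicate, ignored
```

In this sketch the subscriber is notified only when a new matching fact arrives, which is the difference from actively polling each source with one-off queries.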


Cited By

  • (2024) RDF Stream Taxonomy: Systematizing RDF Stream Types in Research and Practice. Electronics 13:13 (2558). https://doi.org/10.3390/electronics13132558. Online publication date: 29-Jun-2024

    Published In

    WWW '22: Companion Proceedings of the Web Conference 2022
    April 2022
    1338 pages
    ISBN:9781450391306
    DOI:10.1145/3487553
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. knowledge graph
    2. open data
    3. scientific publications dataset

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Funding Sources

    • Berlin Big Data Center and Berlin Institute for the Foundations of Learning and Data
    • Cosmopolitan Diabetes Foundation
    • Weizenbaum Institute, Deutsches Internet-Institut
    • NFDI4DataScience

    Conference

    WWW '22
    WWW '22: The ACM Web Conference 2022
    April 25 - 29, 2022
    Virtual Event, Lyon, France

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
