Abstract
The CODO ontology was designed to capture data about the Covid-19 pandemic. Its goal was to collect epidemiological data so that medical professionals could perform contact tracing and answer questions about infection paths, based on the relations between patients, geography, time, and so on. We took information from various spreadsheets and integrated it into one consistent knowledge graph that can be queried with SPARQL and visualized with the Gruff tool in AllegroGraph. The ontology is published on BioPortal and has been used by two projects to date. This paper describes the process used to design the initial ontology and to develop the transformations that incorporate Indian government data about the pandemic. We went from an ontology to a large knowledge graph with approximately 5 million triples in a few months. Our experience demonstrates some common principles that apply to scaling up from an ontology model to a knowledge graph with real-world data.
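To make the spreadsheet-to-graph step concrete, the following is a minimal sketch in Python of how rows might become triples and how an infection path might then be traced. It is not the CODO implementation, which was written in Lisp and SPARQL. The namespace, the property names (`contractedFrom`, `diagnosedOn`, `city`), the column names, and the sample rows are all hypothetical, chosen only for illustration, and the sketch assumes each patient has at most one recorded source of infection.

```python
import csv
import io

# Hypothetical namespace and vocabulary; these are NOT the actual CODO IRIs.
EX = "http://example.org/codo-sketch#"

# Hypothetical spreadsheet extract with invented rows and column names.
SAMPLE_CSV = """patient_id,city,diagnosed_on,contracted_from
P1,Bengaluru,2020-03-09,
P2,Bengaluru,2020-03-12,P1
P3,Mysuru,2020-03-15,P2
"""


def rows_to_triples(csv_text):
    """Turn each spreadsheet row into (subject, predicate, object) triples."""
    triples = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        patient = EX + row["patient_id"]
        triples.add((patient, EX + "type", EX + "Patient"))
        triples.add((patient, EX + "city", row["city"]))
        triples.add((patient, EX + "diagnosedOn", row["diagnosed_on"]))
        if row["contracted_from"]:  # a blank cell means the source is unknown
            source = EX + row["contracted_from"]
            triples.add((patient, EX + "contractedFrom", source))
    return triples


def infection_path(triples, patient_id):
    """Follow contractedFrom links from a patient back to the earliest known source."""
    # Assumes at most one recorded source per patient.
    source_of = {s: o for s, p, o in triples if p == EX + "contractedFrom"}
    path = [EX + patient_id]
    seen = set(path)  # guards against cycles in inconsistent data
    while path[-1] in source_of and source_of[path[-1]] not in seen:
        nxt = source_of[path[-1]]
        path.append(nxt)
        seen.add(nxt)
    return [iri[len(EX):] for iri in path]


if __name__ == "__main__":
    graph = rows_to_triples(SAMPLE_CSV)
    print(len(graph), "triples")
    print(infection_path(graph, "P3"))
```

In a triple store such as AllegroGraph, the path-following step would typically be expressed as a SPARQL query rather than a Python loop; the loop here only illustrates the kind of question the graph is meant to answer.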
Notes
- 1.
- 2.
- 3.
- 4.
- 5. We used Lisp because our team was very small (two people), the lead developer had the most experience in Lisp, and we wanted to work as rapidly as possible to meet the needs of the pandemic. In future versions we will re-implement the functions in Python.
- 6. The details of the CODO project can be found at: https://github.com/biswanathdutta/CODO. The CODO ontology can be found at: https://bioportal.bioontology.org/ontologies/CODO.
- 7. See [17] for the actual Lisp and SPARQL code for this and all transformations.
References
Dutta, B., DeBellis, M.: CODO: an ontology for collection and analysis of COVID-19 data. In: Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (2020)
Gardner, L.: Modeling the spreading risk of 2019-nCoV (2020). https://systems.jhu.edu/research/public-health/ncov-model-2/
Sirin, E.: Analyzing COVID-19 data with SPARQL (2020). https://www.stardog.com/labs/blog/analyzing-covid-19-data-with-sparql/
He, Y., et al.: CIDO, a community-based ontology for coronavirus disease knowledge and data integration, sharing, and analysis. Sci. Data 7(1), 1–5 (2020)
Cowell, L.G., Smith, B.: Infectious disease ontology. In: Sintchenko, V. (ed.) Infectious Disease Informatics, pp. 373–395. Springer, New York (2010). https://doi.org/10.1007/978-1-4419-1327-2_19
Domingo-Fernández, D., et al.: COVID-19 knowledge graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology. Bioinformatics 37(9), 1332–1334 (2021)
Lin, A., et al.: A community effort for COVID-19 ontology harmonization. In: The 12th International Conference on Biomedical Ontologies (2021)
Dutta, B., Chatterjee, U., Madalli, D.P.: YAMO: yet another methodology for large-scale faceted ontology construction. J. Knowl. Manag. 19(1), 6–24 (2015)
Beck, K.: Extreme Programming Explained: Embrace Change. Addison-Wesley Professional, Boston (2000)
Kroll, P., Kruchten, P.: The Rational Unified Process Made Easy: A Practitioner’s Guide to the RUP. Addison-Wesley Professional, Boston (2003)
O’Connor, M.J., Halaschek-Wiener, C., Musen, M.A., et al.: Mapping master: a flexible approach for mapping spreadsheets to OWL. In: Patel-Schneider, F. (ed.) ISWC 2010. LNCS, vol. 6497, pp. 194–208. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-17749-1_13
Musen, M.A.: The Protégé project: a look back and a look forward. AI Matters 1(4), 4–12 (2015)
SWRL: A semantic web rule language combining OWL and RuleML. W3C Member Submission 21 May 2004 (2016). Accessed Feb 2016
Aasman, J., Cheetham, K.: RDF browser for data discovery and visual query building. In: Proceedings of the Workshop on Visual Interfaces to the Social and Semantic Web (VISSW 2011), Co-located with ACM IUI, p. 53 (2011)
Aasman, J.: Unification of geospatial reasoning, temporal logic, & social network analysis in event-based systems. In: Proceedings of the Second International Conference on Distributed Event-Based Systems, pp. 139–145 (2008)
Singhal, A.: Introducing the knowledge graph: things, not strings. Official Google Blog 5, 16 (2012)
DeBellis, M.: Lisp and SPARQL files for CODO ontology (2020). https://github.com/mdebellis/CODO-Lisp
Chansanam, W., Suttipapa, K., Ahmad, A.R.: COVID-19 ontology evaluation. Int. J. Manag. (IJM) 11(8), 47–57 (2020)
Kejriwal, M.: Knowledge graphs and COVID-19: opportunities, challenges, and implementation. Harv. Data Sci. Rev. (2020)
Arp, R., Smith, B., Spear, A.D.: Building Ontologies with Basic Formal Ontology. MIT Press, Cambridge (2015)
de Almeida Falbo, R.: SABiO: systematic approach for building ontologies. In: CEUR Workshop Proceedings, vol. 1301 (2014)
Garcia, A., et al.: Developing ontologies within decentralised settings. In: Chen, H., Wang, Y., Cheung, K.H. (eds.) Semantic e-Science. AOIS, vol. 11, pp. 99–139. Springer, Boston (2010). https://doi.org/10.1007/978-1-4419-5908-9_4
Keet, C.M., et al.: The use of foundational ontologies in ontology development: an empirical assessment. In: Antoniou, G. (ed.) ESWC 2011. LNCS, vol. 6643, pp. 321–335. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21034-1_22
Boehm, B.W.: A spiral model of software development and enhancement. Computer 21(5), 61–72 (1988)
Pallozzi, D.: The word that took the tech world by storm: returning to the roots of agile (2018). https://www.thoughtworks.com/en-in/perspectives/edition1-agile/article
Acknowledgement
This work has been supported by the Indian Statistical Institute, Kolkata. This work was conducted using the Protégé resource, which is supported by grant GM10331601 from the National Institute of General Medical Sciences of the United States National Institutes of Health. Thanks to Franz Inc. (http://www.allegrograph.com) for their help with AllegroGraph and Gruff. Thanks to Dr. Sivaram Arabandi, MD for his feedback on the CODO ontology.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
DeBellis, M., Dutta, B. (2021). The Covid-19 CODO Development Process: an Agile Approach to Knowledge Graph Development. In: Villazón-Terrazas, B., Ortiz-Rodríguez, F., Tiwari, S., Goyal, A., Jabbar, M. (eds) Knowledge Graphs and Semantic Web. KGSWC 2021. Communications in Computer and Information Science, vol 1459. Springer, Cham. https://doi.org/10.1007/978-3-030-91305-2_12
DOI: https://doi.org/10.1007/978-3-030-91305-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91304-5
Online ISBN: 978-3-030-91305-2
eBook Packages: Computer Science, Computer Science (R0)