Abstract
Scientists in the Earth and Environmental Sciences (EES) domain increasingly use ontologies to analyze and integrate their data. For example, NASA’s SWEET (Semantic Web for Earth and Environmental Terminology) ontologies have become the de facto standard for formally representing the EES domain (Raskin 2010). We must now develop principled ways to evaluate existing ontologies and to quantify their quality. The literature describes many potential quality metrics for ontologies. Among them is the coverage metric, which approximates the relevance of an ontology to a corpus (Yao et al. 2011). This paper makes three primary contributions to the EES domain: (1) an investigation of the applicability of existing coverage techniques to the EES domain; (2) a novel extension of existing techniques that uses thesauri to generate equivalence and subclass axioms automatically; and (3) an experiment to establish an upper-bound coverage expectation for the SWEET ontologies against real-world EES corpora from DataONE (Michener et al. 2012) and against a corpus built from research articles specifically to match the topics covered by the SWEET ontologies. This initial evaluation suggests that the SWEET ontologies can accurately represent real corpora within the EES domain.
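As a rough illustration of the coverage idea, the sketch below computes the fraction of distinct corpus terms that match an ontology label, first directly and then with thesaurus synonyms allowed as matches. The abstract does not give the exact formula used in the paper, so the definition here, the function name `coverage`, and the toy terms, labels, and thesaurus are all assumptions made for illustration only.

```python
def coverage(corpus_terms, ontology_labels, thesaurus=None):
    """Fraction of distinct corpus terms matched by an ontology label.

    Assumed definition (not taken from the paper): a term counts as covered
    if it equals a label, or if one of its thesaurus synonyms equals a label.
    Matching is case-insensitive; thesaurus keys are expected in lowercase.
    """
    thesaurus = thesaurus or {}
    labels = {label.lower() for label in ontology_labels}
    terms = {term.lower() for term in corpus_terms}
    if not terms:
        return 0.0
    matched = set()
    for term in terms:
        synonyms = {s.lower() for s in thesaurus.get(term, ())}
        if term in labels or synonyms & labels:
            matched.add(term)
    return len(matched) / len(terms)


# Hypothetical toy data, not drawn from SWEET or DataONE.
corpus = ["Rainfall", "soil", "aquifer", "budget"]
labels = ["Precipitation", "Soil", "Aquifer"]
thesaurus = {"rainfall": ["precipitation"]}

direct = coverage(corpus, labels)               # exact label matches only
expanded = coverage(corpus, labels, thesaurus)  # synonyms also count
```

In this toy case the thesaurus lets "rainfall" match the label "Precipitation", so the expanded score is higher than the direct one, which is the intuition behind using thesauri to raise the achievable coverage.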
References
Bird S, Loper E, Klein E (2009) Natural language processing with Python
Brank J, Grobelnik M, Mladenić D (2005) A survey of ontology evaluation techniques. In: Conference on data mining and data warehouses (SiKDD 2005)
Cimiano P, Hotho A, Staab S (2005) Learning concept hierarchies from text corpora using formal concept analysis. J Artif Intell Res (JAIR) 24:305–339
Dellschaft K, Staab S (2006) On how to perform a gold standard based evaluation of ontology learning. In: The Semantic Web-ISWC 2006. Springer, pp 228–241
Devlin J (1961) A dictionary of synonyms and antonyms
Doan A, Madhavan J, Dhamankar R, Domingos P, Halevy A (2003) Learning to match ontologies on the semantic web. VLDB J - Int J Very Large Data Bases 12(4):303–319
Gibson A, Wolstencroft K, Stevens R (2007) Promotion of ontological comprehension: Exposing terms and metadata with web 2.0. In: Workshop on social and collaborative construction of structured knowledge at WWW 2007
Hahn U, Schnattinger K (1998) Towards text knowledge engineering. Hypothesis 1:2
Hoehndorf R, Dumontier M, Gkoutos GV (2012) Evaluation of research in biomedical ontologies. Briefings in Bioinformatics
S I (2001) Scholastic dictionary of synonyms, antonyms, and homonyms
Kauppinen T, Pouchard L, Kessler C (2011) Proceedings of the First International Workshop on Linked Science (LISC 2011), volume CEUR Workshop Proceedings, p 783
Kipfer BA (1993) 21st century synonym and antonym finder. Dell
Laird CG (2003) Webster’s New World Roget’s A-Z Thesaurus. SimonandSchuster.com
LaRoche N, Rodale JJI, Urdang L (1978) The Synonym Finder. Rodale
Lawrie D, Binkley D, Morrell C (2010) Normalizing source code vocabulary. In: 17th Working Conference on Reverse Engineering (WCRE) 2010. IEEE, pp 3–12
Lynnes C (2012) Toolmatch. Proceedings of the ESIP Summer Meeting
Maedche A, Staab S (2002) Measuring similarity between ontologies. In: Knowledge engineering and knowledge management: Ontologies and the semantic web. Springer, pp 251–263
Maynard D, Peters W, Li Y (2006) Metrics for evaluation of ontology-based information extraction. In: International world wide web conference
Michener WK, Allard S, Budden A, Cook RB, Douglass K, Frame M, Kelling S, Koskela R, Tenopir C, Vieglais DA (2012) Participatory design of DataONE: enabling cyberinfrastructure for the biological and environmental sciences. Ecol Inform 11:5–15
Miller GA (1995) WordNet: a lexical database for English. Commun ACM 38(11):2
Pouchard L, Cook R, Green J, Palanisamy G, Noy N (2011) Semantic technologies improving the recall and precision of the mercury metadata search engine. AGU Fall Meet Abstr 1:1437
Raskin RG (2010) SWEET 2.1 Ontologies. AGU Fall Meeting Abstracts, p B6+
Rozell E, Fox P, Zheng J, Hendler J (2012) S2s architecture and faceted browsing applications. In: Proceedings of the 21st international conference companion on World Wide Web, WWW ’12 Companion. ACM, New York, pp 413–416
Spooner A (2007) The Oxford dictionary of synonyms and antonyms. Oxford University Press
Tripathi A, Babaie HA (2008) Developing a modular hydrogeology ontology by extending the SWEET upper-level ontologies. Comput Geosci 34(9):1022–1033
Verspoor K, Cohn J, Mniszewski S, Joslyn C (2006) A categorization approach to automated ontological function annotation. Protein Sci 15(6):1544–1549
Wiegand N, Garcia C (2007) A task-based ontology approach to automate geospatial data retrieval. Trans GIS 11(3):355–376
Yao L, Divoli A, Mayzus I, Evans JA, Rzhetsky A (2011) Benchmarking ontologies: bigger or better? PLoS Comput Biol 7(1):e1001055+
Acknowledgments
This material is based upon work supported by the National Science Foundation, through Award CCF-1116943 and through a Graduate Research Fellowship under Grant No. DGE-0808392. Michael Huhns was extremely helpful in directing and crystallizing this research. We would also like to thank Andrey Rzhetsky for providing the seven thesauri used in our experiment.
Additional information
Communicated by: H. A. Babaie
Cite this article
DiGiuseppe, N., Pouchard, L.C. & Noy, N.F. SWEET ontology coverage for earth system sciences. Earth Sci Inform 7, 249–264 (2014). https://doi.org/10.1007/s12145-013-0143-1
DOI: https://doi.org/10.1007/s12145-013-0143-1