Abstract
The -omics data revolution, galvanized by the development of the web, has resulted in large numbers of valuable public databases and repositories. Scientists wishing to employ these data in their research face the question of how to approach data integration. Ad hoc solutions can result in diminished generality, interoperability, and reusability, as well as loss of data provenance. One of the promising notions that the Semantic Web brings to the life sciences is that experimental data can be described with relevant life science terms and concepts. Subsequent integration and analysis can then take advantage of those terms, exposing logic that might otherwise be available only by interpreting program code. In the context of a biological use case, we examine a general Semantic Web approach to integrating experimental measurement data with Semantic Web tools such as Protégé and Sesame. The approach to data integration that we define is based on the linking of data with OWL classes. The general pattern that we apply consists of 1) building application-specific ontologies for “myModel”, 2) identifying the concepts involved in the biological hypothesis, 3) finding data instances of the concepts, 4) finding a common domain to be used for integration, and 5) integrating the data. Our experience with current tools indicates a few Semantic Web bottlenecks, such as a general lack of ‘semantic disclosure’ from public data resources and the need for better ‘interval join’ performance from RDF query engines.
An erratum to this chapter can be found at http://dx.doi.org/10.1007/11915034_125.
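The ‘interval join’ named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which relied on RDF query engines such as Sesame; it is plain Python, and it assumes that the common domain is a genome coordinate system and that both data sets are given as closed intervals on it. The names `Interval`, `interval_join`, and all labels and coordinates are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    label: str   # hypothetical identifier for a gene or a measurement
    chrom: str   # chromosome name, part of the assumed common domain
    start: int   # first position covered (inclusive)
    end: int     # last position covered (inclusive)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both intervals lie on the same chromosome and share a position."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def interval_join(left: list[Interval], right: list[Interval]) -> list[tuple[str, str]]:
    """Pair every left interval with every right interval it overlaps.

    This nested loop compares all pairs, so its cost grows with
    len(left) * len(right); that growth is why join performance matters
    on genome-scale data.
    """
    return [(a.label, b.label) for a in left for b in right if overlaps(a, b)]


# Hypothetical instances of two concepts, located in the common domain.
genes = [
    Interval("geneA", "chr1", 100, 500),
    Interval("geneB", "chr1", 900, 1200),
]
measurements = [
    Interval("probe1", "chr1", 450, 550),
    Interval("probe2", "chr1", 600, 700),
    Interval("probe3", "chr2", 100, 200),
    Interval("probe4", "chr1", 1200, 1300),
]

pairs = interval_join(genes, measurements)
print(pairs)  # [('geneA', 'probe1'), ('geneB', 'probe4')]
```

In this sketch, steps 3 to 5 of the pattern correspond to building the two lists, choosing chromosome and position as the shared domain, and joining on overlap. A sorted sweep or an interval tree would reduce the number of comparisons, at the price of more code.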
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marshall, M.S., Post, L., Roos, M., Breit, T.M. (2006). Using Semantic Web Tools to Integrate Experimental Measurement Data on Our Own Terms. In: Meersman, R., Tari, Z., Herrero, P. (eds) On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops. OTM 2006. Lecture Notes in Computer Science, vol 4277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11915034_92
DOI: https://doi.org/10.1007/11915034_92
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48269-7
Online ISBN: 978-3-540-48272-7
eBook Packages: Computer Science (R0)