Abstract
Ecologists spend considerable effort integrating heterogeneous data for statistical analyses and simulations, for example, to run and test predictive models. Our research is focused on reducing this effort by providing data integration and transformation tools, allowing researchers to focus on “real science,” that is, discovering new knowledge through analysis and modeling. This paper defines a generic framework for transforming heterogeneous data within scientific workflows. Our approach relies on a formalized ontology, which serves as a simple, unstructured global schema. In the framework, inputs and outputs of services within scientific workflows can have structural types and separate semantic types (expressions of the target ontology). In addition, a registration mapping can be defined to relate input and output structural types to their corresponding semantic types. Using registration mappings, appropriate data transformations can then be generated for each desired service composition. Here, we describe our proposed framework and an initial implementation for services that consume and produce XML data.
This work was supported in part by National Science Foundation (NSF) grants ITR 0225676 (SEEK) and ITR 0225673 (GEON), and by DOE grant DE-FC02-01ER25486 (SciDAC-SDM).
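The role of registration mappings described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm or implementation: it assumes a toy is-a ontology, registration mappings written as plain dictionaries from XPath-like structural paths to concept names, and a simple subsumption test for deciding which source path can feed which target path. All concept names, paths, and function names below are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical toy ontology: each concept maps to its parent concept (is-a).
ONTOLOGY: Dict[str, Optional[str]] = {
    "Measurement": None,
    "Abundance": "Measurement",
    "SpeciesCount": "Abundance",
    "Location": None,
    "Site": "Location",
}


def is_subconcept(concept: Optional[str], ancestor: str) -> bool:
    """Return True if `concept` equals `ancestor` or lies below it in the is-a chain."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY.get(concept)
    return False


# Hypothetical registration mappings: structural path -> semantic type.
SOURCE_OUTPUT = {"/obs/plot": "Site", "/obs/cnt": "SpeciesCount"}
TARGET_INPUT = {"/sample/where": "Location", "/sample/abundance": "Abundance"}


def derive_correspondences(source: Dict[str, str], target: Dict[str, str]) -> Dict[str, str]:
    """For each target path, pick the first source path whose semantic type it subsumes."""
    result: Dict[str, str] = {}
    for t_path, t_type in target.items():
        for s_path, s_type in source.items():
            if is_subconcept(s_type, t_type):
                result[t_path] = s_path
                break
    return result


def transform(record: Dict[str, object], correspondences: Dict[str, str]) -> Dict[str, object]:
    """Restructure a flat source record into the target's structural layout."""
    return {t: record[s] for t, s in correspondences.items() if s in record}


correspondences = derive_correspondences(SOURCE_OUTPUT, TARGET_INPUT)
converted = transform({"/obs/plot": "A7", "/obs/cnt": 12}, correspondences)
```

In this sketch the two services never agree on a structure directly. The structural correspondence between them is derived only from the semantic types each one registers against the shared ontology, which is the general idea the abstract attributes to registration mappings.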
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bowers, S., Ludäscher, B. (2004). An Ontology-Driven Framework for Data Transformation in Scientific Workflows. In: Rahm, E. (ed.) Data Integration in the Life Sciences. DILS 2004. Lecture Notes in Computer Science, vol 2994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24745-6_1
DOI: https://doi.org/10.1007/978-3-540-24745-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21300-0
Online ISBN: 978-3-540-24745-6
eBook Packages: Springer Book Archive