An Ontology-Driven Framework for Data Transformation in Scientific Workflows

  • Conference paper
Data Integration in the Life Sciences (DILS 2004)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 2994)

Abstract

Ecologists spend considerable effort integrating heterogeneous data for statistical analyses and simulations, for example, to run and test predictive models. Our research is focused on reducing this effort by providing data integration and transformation tools, allowing researchers to focus on “real science,” that is, discovering new knowledge through analysis and modeling. This paper defines a generic framework for transforming heterogeneous data within scientific workflows. Our approach relies on a formalized ontology, which serves as a simple, unstructured global schema. In the framework, inputs and outputs of services within scientific workflows can have structural types and separate semantic types (expressions of the target ontology). In addition, a registration mapping can be defined to relate input and output structural types to their corresponding semantic types. Using registration mappings, appropriate data transformations can then be generated for each desired service composition. Here, we describe our proposed framework and an initial implementation for services that consume and produce XML data.
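As a rough illustration of how registration mappings could drive a transformation between two composed services, consider the minimal sketch below. It is not the authors' implementation: the XML paths, the concept names (`Species`, `Abundance`, `Location`), and both helper functions are hypothetical, and it matches semantic types by exact name only, whereas an ontology-based approach could also reason over relationships between concepts.

```python
from typing import Dict, List

# Hypothetical registration mappings: structural path -> ontology concept.
# Every path and concept name here is invented for illustration.
SOURCE_OUTPUT: Dict[str, str] = {
    "/samples/sample/taxon": "Species",
    "/samples/sample/n": "Abundance",
}
TARGET_INPUT: Dict[str, str] = {
    "/obs/record/species": "Species",
    "/obs/record/count": "Abundance",
    "/obs/record/site": "Location",
}


def derive_correspondences(source: Dict[str, str],
                           target: Dict[str, str]) -> Dict[str, str]:
    """Map each target path to the source path registered to the same concept."""
    by_concept = {concept: path for path, concept in source.items()}
    return {
        t_path: by_concept[concept]
        for t_path, concept in target.items()
        if concept in by_concept
    }


def unmatched_targets(source: Dict[str, str],
                      target: Dict[str, str]) -> List[str]:
    """List target paths whose concept the source service does not provide."""
    covered = set(source.values())
    return sorted(path for path, concept in target.items()
                  if concept not in covered)


if __name__ == "__main__":
    print(derive_correspondences(SOURCE_OUTPUT, TARGET_INPUT))
    print(unmatched_targets(SOURCE_OUTPUT, TARGET_INPUT))
```

The resulting path-to-path correspondences are the kind of information from which a structural transformation (for example, an XML query) might then be generated; any target paths left unmatched indicate that the composition cannot be fully satisfied from the source service's output alone.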

This work was supported in part by the National Science Foundation (NSF) grants ITR 0225676 (SEEK) and ITR 0225673 (GEON), and by DOE grant DE-FC02-01ER25486 (SciDAC-SDM).

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bowers, S., Ludäscher, B. (2004). An Ontology-Driven Framework for Data Transformation in Scientific Workflows. In: Rahm, E. (ed.) Data Integration in the Life Sciences. DILS 2004. Lecture Notes in Computer Science, vol 2994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24745-6_1

  • DOI: https://doi.org/10.1007/978-3-540-24745-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21300-0

  • Online ISBN: 978-3-540-24745-6

  • eBook Packages: Springer Book Archive
