Abstract
Current methods for data integration are as difficult to use as they are powerful. Motivated by our work with clinical data and the people who analyze it, we present two components that allow non-technical users who are domain experts to create and reuse complex data integration processes. The GUAVA (GUI As View Apparatus) component enables data analysts to make informed data integration decisions based on detailed accounts of the user interface that was used to generate the data. The MultiClass component allows analysts to revisit decisions made for prior studies and to choose, each time the data is used, whether to reuse them. We illustrate the two components with examples in which a warehouse of clinical data is used to support research studies. We also describe the state of our implementation and explain why we believe the two components can be translated automatically into ETL workflows.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Terwilliger, J.F., Delcambre, L.M.L., Logan, J. (2006). Context-Sensitive Clinical Data Integration. In: Grust, T., et al. Current Trends in Database Technology – EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 4254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11896548_30
DOI: https://doi.org/10.1007/11896548_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46788-5
Online ISBN: 978-3-540-46790-8
eBook Packages: Computer Science, Computer Science (R0)