Abstract
The integration of heterogeneous legacy databases requires an understanding of database structure and content. We previously developed a theoretical and software infrastructure to support the extraction of schema and business rule information from legacy sources, combining database reverse engineering with semantic analysis of associated application code (DRE/SA). In this paper, we present a compact formalism called EITH that unifies the representation of database schema and application code. EITH can be efficiently derived from various types of schema representations, particularly the relational model, and supports comparison of a wide variety of schema and code constructs to enable interoperation. Unlike UML or E/R diagrams, for example, EITH has a compact notation, is unambiguous, and uses a small set of efficient heuristics. We show how EITH is employed in the context of SEEK, using a construction project management example. We also show how EITH can represent various structures in relational databases, and can serve as an efficient representation for E/R diagrams. This suggests that EITH can support efficient matching of more complex, hierarchical structures via indexed tree representations, without compromising the EITH design philosophy or formalism.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schmalz, M.S., Hammer, J., Wu, M., Topsakal, O. (2003). EITH – A Unifying Representation for Database Schema and Application Code in Enterprise Knowledge Extraction. In: Song, IY., Liddle, S.W., Ling, TW., Scheuermann, P. (eds) Conceptual Modeling - ER 2003. ER 2003. Lecture Notes in Computer Science, vol 2813. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39648-2_36
DOI: https://doi.org/10.1007/978-3-540-39648-2_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20299-8
Online ISBN: 978-3-540-39648-2
eBook Packages: Springer Book Archive