Abstract
Schema matching, the problem of finding semantic correspondences between elements of source and warehouse schemas, plays a key role in data warehousing. Currently, the mappings are largely determined manually by domain experts, which is a time-consuming process. In this paper, based on a multistrategy schema matching framework, we develop a linguistic matching algorithm that uses semantic distances between words to compute their semantic similarity, and we propose a structural matching algorithm based on semantic similarity propagation. After describing our approach, we present experimental results on several real-world domains and show that the algorithm discovers semantic mappings with a high degree of accuracy.
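The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, not the authors' algorithm. It assumes a hypothetical toy taxonomy in place of a real lexical database, takes semantic distance to be the shortest path length between two words, maps distance d to similarity 1 / (1 + d), and performs a single propagation round in which a pair of schema elements blends its own score with the mean best-match score of its children. All names (`TAXONOMY`, `word_similarity`, `propagate`, `alpha`) are illustrative assumptions.

```python
from collections import deque

# Hypothetical toy taxonomy (parent -> children); a stand-in for a lexical database.
TAXONOMY = {
    "entity": ["person", "location"],
    "person": ["customer", "client"],
    "location": ["address", "city"],
}


def build_graph(taxonomy):
    """Turn the parent -> children map into an undirected adjacency map."""
    graph = {}
    for parent, children in taxonomy.items():
        for child in children:
            graph.setdefault(parent, set()).add(child)
            graph.setdefault(child, set()).add(parent)
    return graph


def semantic_distance(graph, a, b):
    """Shortest path length (in edges) between two words, or None if unreachable."""
    if a not in graph or b not in graph:
        return None
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def word_similarity(graph, a, b):
    """Assumed mapping from distance to similarity: 1 / (1 + distance)."""
    d = semantic_distance(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def propagate(sim, children_s, children_t, alpha=0.5):
    """One propagation round: blend each pair's own score with the mean
    best-match score of its child pairs. Pairs without children are unchanged."""
    updated = dict(sim)
    for (s, t), own in sim.items():
        cs, ct = children_s.get(s, []), children_t.get(t, [])
        if not cs or not ct:
            continue
        best = [max(sim.get((c, d), 0.0) for d in ct) for c in cs]
        updated[(s, t)] = (1 - alpha) * own + alpha * sum(best) / len(best)
    return updated
```

In this sketch the linguistic scores would seed `sim`, and `propagate` would then raise or lower the score of a parent pair according to how well its children match. The weight `alpha` and the single-round update are arbitrary choices made for illustration.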
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cheng, W., Lin, H., Sun, Y. (2005). An Efficient Schema Matching Algorithm. In: Khosla, R., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2005. Lecture Notes in Computer Science, vol 3682. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11552451_134
DOI: https://doi.org/10.1007/11552451_134
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28895-4
Online ISBN: 978-3-540-31986-3
eBook Packages: Computer Science, Computer Science (R0)