Abstract
Schema mapping produces a semantic correspondence between two schemas. Automating schema mapping is challenging, and the existence of 1:n (or n:1) and n:m mapping cardinalities makes the problem even harder. Recently, we have studied automated schema mapping techniques (using data frames and domain ontology snippets) that address not only the traditional 1:1 mapping problem but also the harder 1:n and n:m mapping problems. Experimental results show that the approach can achieve excellent precision and recall. In this paper, we share our experiences and the lessons we have learned during our schema mapping studies.