Automatic direct and indirect schema mapping: experiences and lessons learned

Published: 01 December 2004

Abstract

Schema mapping produces a semantic correspondence between two schemas. Automating schema mapping is challenging. The existence of 1:n (or n:1) and n:m mapping cardinalities makes the problem even harder. Recently, we have studied automated schema mapping techniques (using data frames and domain ontology snippets) that not only address the traditional 1:1 mapping problem, but also the harder 1:n and n:m mapping problems. Experimental results show that the approach can achieve excellent precision and recall. In this paper, we share our experiences and lessons we have learned during our schema mapping studies.



  • Published in

    ACM SIGMOD Record, Volume 33, Issue 4
    December 2004
    92 pages
    ISSN: 0163-5808
    DOI: 10.1145/1041410

    Copyright © 2004 Authors

    Publisher

    Association for Computing Machinery
    New York, NY, United States
