Approximate Common Structures in XML Schema Matching

  • Conference paper
Advances in Web-Age Information Management (WAIM 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3739)

Abstract

This paper describes a matching algorithm that finds accurate matches and scales to large XML Schemas with hundreds of nodes. We model XML Schemas as labeled, unordered, rooted trees, which turns the schema matching problem into a tree matching problem. We develop a tree matching algorithm based on the concept of Approximate Common Structures. Compared with the tree edit-distance algorithm and other schema matching systems, our algorithm is faster and better suited to matching large XML Schemas.
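As a rough illustration of the modeling step only, the sketch below turns the element declarations of a small XML Schema into a labeled, unordered, rooted tree. It is not the paper's Approximate Common Structures algorithm. The names `Node`, `child_elements`, `build_tree`, `canonical`, `size`, and `schema_tree`, as well as the two sample schemas, are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

XS = "{http://www.w3.org/2001/XMLSchema}"  # XML Schema namespace in ElementTree form


@dataclass
class Node:
    """One node of a labeled rooted tree (hypothetical helper type)."""
    label: str
    children: list = field(default_factory=list)


def child_elements(xsd_node):
    """Yield the nearest nested xs:element declarations, skipping wrappers
    such as xs:complexType and xs:sequence."""
    for child in xsd_node:
        if child.tag == XS + "element":
            yield child
        else:
            yield from child_elements(child)


def build_tree(xsd_element):
    """Map an xs:element declaration to a tree node labeled by its name."""
    node = Node(xsd_element.get("name") or xsd_element.get("ref", "?"))
    node.children = [build_tree(c) for c in child_elements(xsd_element)]
    return node


def canonical(node):
    """Order-independent form: sorting the child forms makes sibling order
    irrelevant, which is what 'unordered' means here."""
    return (node.label, tuple(sorted(canonical(c) for c in node.children)))


def size(node):
    """Number of nodes in the tree."""
    return 1 + sum(size(c) for c in node.children)


def schema_tree(xsd_text):
    """Build the tree for the first top-level element of a schema document."""
    root = ET.fromstring(xsd_text)
    return build_tree(root.find(XS + "element"))


XSD_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType><xs:sequence>
      <xs:element name="id" type="xs:string"/>
      <xs:element name="customer">
        <xs:complexType><xs:sequence>
          <xs:element name="name" type="xs:string"/>
        </xs:sequence></xs:complexType>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

# Same declarations as XSD_A, with the two children of "order" swapped.
XSD_B = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType><xs:sequence>
      <xs:element name="customer">
        <xs:complexType><xs:sequence>
          <xs:element name="name" type="xs:string"/>
        </xs:sequence></xs:complexType>
      </xs:element>
      <xs:element name="id" type="xs:string"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

tree_a = schema_tree(XSD_A)
tree_b = schema_tree(XSD_B)
same_unordered = canonical(tree_a) == canonical(tree_b)
```

Under this representation the two sample schemas yield the same canonical form even though their sibling order differs. An approximate matcher would go further and tolerate label and structure differences, which this sketch does not attempt.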

The work has been supported by NSERC, CITO, NSERC CRD and NCE Auto 21.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, S., Lu, J., Wang, J. (2005). Approximate Common Structures in XML Schema Matching. In: Fan, W., Wu, Z., Yang, J. (eds) Advances in Web-Age Information Management. WAIM 2005. Lecture Notes in Computer Science, vol 3739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563952_100

  • DOI: https://doi.org/10.1007/11563952_100

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29227-2

  • Online ISBN: 978-3-540-32087-6

  • eBook Packages: Computer Science, Computer Science (R0)
