Abstract
The DTD of a set of XML documents may change for many reasons, such as changes in real-world events, changes in user requirements, and mistakes in the initial design. In this paper, we present a novel algorithm called DTD-Diff that detects changes to the DTDs that define the structure of a set of XML documents. Such a change detection tool is useful in several ways, including maintenance of XML documents, incremental maintenance of relational schemas for storing XML data, and XML schema integration. We compare DTD-Diff with existing XML change detection approaches and show that converting a DTD to XML Schema (XSD), which is itself in XML document format, and then detecting the changes with existing XML change detection algorithms is not a feasible option. Our experimental results show that DTD-Diff is 5–325 times faster than X-Diff applied to the corresponding XSD files. We also study the quality of the detected deltas.
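To make the notion of a DTD delta concrete, the following is a minimal illustrative sketch and not the DTD-Diff algorithm itself, whose details are not given in this abstract. It assumes that a delta can be approximated by comparing element declarations by name; the names `element_declarations` and `naive_dtd_delta` are hypothetical, and attribute and entity declarations are ignored.

```python
import re

# Matches declarations of the form <!ELEMENT name content-model>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([^\s>]+)\s+([^>]*)>")


def element_declarations(dtd_text):
    """Map each declared element name to its content model, with whitespace removed."""
    return {
        name: "".join(model.split())
        for name, model in ELEMENT_DECL.findall(dtd_text)
    }


def naive_dtd_delta(old_dtd, new_dtd):
    """Report element declarations that were inserted, deleted, or updated.

    Hypothetical helper for illustration only: it compares content models
    as normalized strings, so it does not detect moves or reordering
    inside a content model the way a dedicated algorithm might.
    """
    old = element_declarations(old_dtd)
    new = element_declarations(new_dtd)
    return {
        "inserted": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "updated": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }


OLD_DTD = """
<!ELEMENT book (title, author+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
"""

NEW_DTD = """
<!ELEMENT book (title, author+, year?)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author ( #PCDATA )>
<!ELEMENT year (#PCDATA)>
"""

delta = naive_dtd_delta(OLD_DTD, NEW_DTD)
print(delta)
```

In this sketch the `author` declaration differs only in whitespace and is therefore not reported as a change, while `book` is reported as updated and `year` as inserted.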
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leonardi, E., Hoai, T.T., Bhowmick, S.S., Madria, S. (2006). DTD-Diff: A Change Detection Algorithm for DTDs. In: Li Lee, M., Tan, KL., Wuwongse, V. (eds) Database Systems for Advanced Applications. DASFAA 2006. Lecture Notes in Computer Science, vol 3882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733836_59
DOI: https://doi.org/10.1007/11733836_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33337-1
Online ISBN: 978-3-540-33338-8
eBook Packages: Computer Science