Abstract
Previous work on change detection in XML documents is not suitable for large XML documents, because it requires a large amount of memory to hold both versions of a document at once. In this paper, we take a more conservative yet novel approach: we use traditional relational database engines to detect changes to large unordered XML documents. We describe how changes to unordered XML documents are detected using a relational database. To this end, we have implemented a prototype system called Xandy that converts XML documents into relational tuples and detects changes from these tuples using SQL queries. Our experimental results show that the relational approach scales better than published algorithms such as X-Diff, while its result quality is comparable to that of X-Diff.
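To make the general idea concrete, the following is a minimal sketch of detecting changes between two XML versions with SQL. It is not Xandy's actual schema or algorithm, which the abstract does not specify. The table name `leaf`, the functions `shred` and `detect_changes`, and the choice of SQLite are all assumptions made for illustration. Each leaf element is stored as a (version, path, value) tuple, and a set difference between the two versions reports deleted and inserted leaves. Sibling order is ignored, which matches the unordered model, but duplicate leaves collapse under `EXCEPT`, so this is a simplification.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text, version, conn):
    """Store every leaf element as a (version, path, value) tuple."""
    root = ET.fromstring(xml_text)

    def walk(node, parent_path):
        path = f"{parent_path}/{node.tag}"
        children = list(node)
        if not children:
            # Leaf element: record its root-to-leaf path and text value.
            conn.execute(
                "INSERT INTO leaf VALUES (?, ?, ?)",
                (version, path, (node.text or "").strip()),
            )
        for child in children:
            walk(child, path)

    walk(root, "")


def detect_changes(old_xml, new_xml):
    """Return (deleted, inserted) leaf tuples between two XML versions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE leaf (version INTEGER, path TEXT, value TEXT)")
    shred(old_xml, 1, conn)
    shred(new_xml, 2, conn)
    # Set difference on (path, value); sibling order plays no role.
    diff = """
        SELECT path, value FROM leaf WHERE version = ?
        EXCEPT
        SELECT path, value FROM leaf WHERE version = ?
        ORDER BY path, value
    """
    deleted = conn.execute(diff, (1, 2)).fetchall()
    inserted = conn.execute(diff, (2, 1)).fetchall()
    conn.close()
    return deleted, inserted


old_doc = ("<books><book><title>A</title><price>10</price></book>"
           "<book><title>B</title><price>20</price></book></books>")
new_doc = ("<books><book><title>B</title><price>20</price></book>"
           "<book><title>A</title><price>12</price></book></books>")
deleted, inserted = detect_changes(old_doc, new_doc)
print("deleted:", deleted)
print("inserted:", inserted)
```

In this example the two `book` elements swap places and one price changes. Only the price leaf appears in the result (as one deletion and one insertion), because the reordering of siblings is not treated as a change. Because the comparison runs inside the database engine, the documents do not need to be held as in-memory trees during the diff.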
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leonardi, E., Bhowmick, S.S., Madria, S. (2005). Xandy: Detecting Changes on Large Unordered XML Documents Using Relational Databases. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_65
DOI: https://doi.org/10.1007/11408079_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25334-1
Online ISBN: 978-3-540-32005-0
eBook Packages: Computer Science (R0)