Abstract
In this paper we present a new technique for detecting changes on the Web. We propose a new method to measure the similarity of two documents, which can be used efficiently to discover changes in selected portions of the original document. The proposed technique has been implemented in the CDWeb system, which provides a change monitoring service on the Web. CDWeb differs from previously proposed systems in that it allows the detection of changes in portions of documents and of specific changes expressed by means of complex conditions; for example, users might want to know whether the value of a given stock has increased by more than 10%. Several tests on stock exchange and auction web pages demonstrated the effectiveness of the proposed approach.
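To make the stock example concrete, the following is a minimal sketch of how such a condition could be evaluated once the old and new values have been extracted from a monitored page portion. It is not the CDWeb implementation: the names `IncreaseCondition` and `parse_price`, and the assumption that prices use US-style number formatting, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class IncreaseCondition:
    """Hypothetical trigger: fires when a value grows by more than threshold_pct percent."""
    threshold_pct: float

    def is_triggered(self, old_value: float, new_value: float) -> bool:
        if old_value <= 0:
            # Relative change is undefined for a non-positive baseline.
            return False
        change_pct = (new_value - old_value) / old_value * 100.0
        return change_pct > self.threshold_pct


def parse_price(text: str) -> float:
    """Convert a page fragment such as '1,234.50' to a float (assumes US-style formatting)."""
    return float(text.strip().replace(",", ""))


# Usage: compare the value seen on the previous visit with the current one.
condition = IncreaseCondition(threshold_pct=10.0)
notify = condition.is_triggered(parse_price("100.00"), parse_price("111.00"))
```

Keeping the condition separate from the value extraction reflects the distinction drawn in the abstract between selecting a portion of a document and stating what kind of change in it is of interest.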
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Flesca, S., Furfaro, F., Masciari, E. (2001). Meaningful Change Detection on the Web. In: Mayr, H.C., Lazansky, J., Quirchmayr, G., Vogel, P. (eds) Database and Expert Systems Applications. DEXA 2001. Lecture Notes in Computer Science, vol 2113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44759-8_4
DOI: https://doi.org/10.1007/3-540-44759-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42527-4
Online ISBN: 978-3-540-44759-7
eBook Packages: Springer Book Archive