Abstract
The World Wide Web is growing and changing at an astonishing rate. For the information on the web to be useful, web information systems such as search engines must keep up with this growth and change. In this paper we study how web documents change. In particular, we study two characteristics of web document change that bear directly on keeping web information systems up to date: the degree of change and the clusteredness of change. We analyze the evolution of web documents with respect to these two measures and discuss the implications for updating web information systems.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lim, L., Wang, M., Padmanabhan, S., Vitter, J.S., Agarwal, R. (2001). Characterizing Web Document Change. In: Wang, X.S., Yu, G., Lu, H. (eds) Advances in Web-Age Information Management. WAIM 2001. Lecture Notes in Computer Science, vol 2118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47714-4_13
DOI: https://doi.org/10.1007/3-540-47714-4_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42298-3
Online ISBN: 978-3-540-47714-3
eBook Packages: Springer Book Archive