Abstract
We study web caching when the input sequence is a depth first search traversal of some tree. There are at least two good motivations for investigating tree traversal as a search technique on the WWW. First, empirical studies of people browsing and searching the WWW have shown that user access patterns are commonly nearly depth first traversals of some tree. Second, as we show in this paper, the problem of visiting all the pages on some WWW site using anchor clicks (clicks on links) and back button clicks, by far the two most common user actions, reduces to the problem of how best to cache a tree traversal sequence (up to constant factors).
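To make the input model concrete, the following is a minimal sketch of how a depth first traversal of a site tree could be turned into a request sequence. It assumes, as one plausible reading of the abstract, that following a link requests the child page and that a back button click requests the parent page again. The function name `dfs_request_sequence` and the example site are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def dfs_request_sequence(tree: Dict[str, List[str]], root: str) -> List[str]:
    """Return the pages requested by a depth first traversal starting at root.

    Assumption: an anchor click requests the child page, and the back
    button click that follows the child's subtree requests the parent again.
    """
    sequence = [root]
    for child in tree.get(root, []):
        # Anchor click: descend into the child's subtree.
        sequence.extend(dfs_request_sequence(tree, child))
        # Back button click: the parent page is requested once more.
        sequence.append(root)
    return sequence


# Hypothetical site: "index" links to "a" and "b"; "a" links to "a1" and "a2".
site = {"index": ["a", "b"], "a": ["a1", "a2"]}
print(dfs_request_sequence(site, "index"))
# ['index', 'a', 'a1', 'a', 'a2', 'a', 'index', 'b', 'index']
```

Under this reading, a tree with n pages yields a sequence of 2(n - 1) + 1 requests, since every edge is crossed once in each direction.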
We show that for tree traversal sequences the optimal offline strategy can be computed efficiently. In the bit model, where the access time of a page is proportional to its size, we show that the online algorithm LRU is (1 + 1/ε)-competitive against an adversary with unbounded cache as long as LRU has a cache of size at least (1 + ε) times the size of the largest item in the input sequence. In the general model, where pages have arbitrary access times and sizes, we show that in order to be constant competitive, any online algorithm needs a cache large enough to store Ω(log n) pages; here n is the number of distinct pages in the input sequence. We provide a matching upper bound by showing that the online algorithm Landlord is constant competitive against an adversary with an unbounded cache if Landlord has a cache large enough to store the Ω(log n) largest pages. This is further theoretical evidence that Landlord is the “right” algorithm for web caching.
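For readers unfamiliar with Landlord, the sketch below simulates the commonly described credit-and-rent form of the algorithm on a request sequence with arbitrary page sizes and retrieval costs. It is an illustration under stated assumptions, not the analysis or code from this chapter: the function name `landlord_cost`, the choice to reset credit to the full cost on a hit, evicting one zero-credit page at a time, and serving oversized pages without caching them are all assumptions of this sketch. Setting `cost` equal to `size` gives an instance of the bit model.

```python
from typing import Dict, Iterable


def landlord_cost(requests: Iterable[str], size: Dict[str, float],
                  cost: Dict[str, float], capacity: float) -> float:
    """Total retrieval cost paid by a Landlord-style cache of the given capacity."""
    credit: Dict[str, float] = {}  # cached page -> remaining credit
    used = 0.0                     # total size of the cached pages
    total = 0.0
    for page in requests:
        if page in credit:
            # Hit: refresh the credit (any value up to cost[page] is allowed).
            credit[page] = cost[page]
            continue
        total += cost[page]        # miss: pay the retrieval cost
        if size[page] > capacity:
            continue               # assumption: a page that cannot fit is not cached
        while used + size[page] > capacity:
            # Charge every cached page rent proportional to its size until
            # the page with the least credit per unit of size reaches zero.
            victim = min(credit, key=lambda q: credit[q] / size[q])
            rent = credit[victim] / size[victim]
            for q in credit:
                credit[q] -= rent * size[q]
            used -= size[victim]
            del credit[victim]     # evict the zero-credit page
        credit[page] = cost[page]
        used += size[page]
    return total


# Traversal of a root "r" with two leaf children, unit sizes and costs.
requests = ["r", "a", "r", "b", "r"]
unit = {"r": 1, "a": 1, "b": 1}
print(landlord_cost(requests, unit, unit, capacity=1))  # 5.0, every request misses
print(landlord_cost(requests, unit, unit, capacity=3))  # 3.0, one miss per distinct page
```

With a cache that holds all three pages, the cost equals that of an adversary with unbounded cache, which pays once per distinct page.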
Supported in part by NSF Grant CCR-9734927 and by ASOSR grant F49620010011.
Supported by the START program Y43-MAT of the Austrian Ministry of Science.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kalyanasundaram, B., Noga, J., Pruhs, K., Woeginger, G. (2000). Caching for Web Searching. In: Algorithm Theory - SWAT 2000. SWAT 2000. Lecture Notes in Computer Science, vol 1851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44985-X_14
DOI: https://doi.org/10.1007/3-540-44985-X_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67690-4
Online ISBN: 978-3-540-44985-0
eBook Packages: Springer Book Archive