Caching for Web Searching

Kalyanasundaram, Bala; Noga, John; Pruhs, Kirk; Woeginger†, Gerhard

doi:10.1007/3-540-44985-X_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1851)

Included in the following conference series:

Scandinavian Workshop on Algorithm Theory

Abstract

We study web caching when the input sequence is a depth first search traversal of some tree. There are at least two good motivations for investigating tree traversal as a search technique on the WWW. First, empirical studies of people browsing and searching the WWW have shown that user access patterns are commonly close to depth first traversals of some tree. Second, as we show in this paper, the problem of visiting all the pages on some WWW site using anchor clicks (clicks on links) and back button clicks, by far the two most common user actions, reduces (up to constant factors) to the problem of how best to cache a tree traversal sequence.
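To make the input model concrete, the following is a minimal sketch of how a tree traversal request sequence might be generated. It assumes that every anchor click and every back button click displays a page and therefore counts as one request, and that the traversal ends back at the root. The function name `dfs_traversal` and the example site are hypothetical and do not come from the paper.

```python
def dfs_traversal(tree, root):
    """Return the page-request sequence of a depth first traversal.

    `tree` maps each page to the list of its child pages (illustrative
    representation, not taken from the paper). Descending to a child models
    an anchor click; returning to the parent models a back button click.
    Both actions display a page, so both are recorded as requests.
    """
    sequence = [root]
    for child in tree.get(root, []):
        sequence.extend(dfs_traversal(tree, child))  # anchor click into the subtree
        sequence.append(root)                        # back button click to this page
    return sequence


# Hypothetical four-page site used only for illustration.
site = {"home": ["docs", "blog"], "docs": ["api"]}
requests = dfs_traversal(site, "home")
print(requests)
```

Each internal page is requested once on entry and once more after each of its subtrees is finished, which is the source of the repeated requests that a cache can exploit.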

We show that for tree traversal sequences the optimal offline strategy can be computed efficiently. In the bit model, where the access time of a page is proportional to its size, we show that the online algorithm LRU is (1 + 1/ε)-competitive against an adversary with unbounded cache as long as LRU has a cache of size at least (1 + ε) times the size of the largest item in the input sequence. In the general model, where pages have arbitrary access times and sizes, we show that in order to be constant competitive, any online algorithm needs a cache large enough to store Ω(log n) pages; here n is the number of distinct pages in the input sequence. We provide a matching upper bound by showing that the online algorithm Landlord is constant competitive against an adversary with an unbounded cache if Landlord has a cache large enough to store the Ω(log n) largest pages. This is further theoretical evidence that Landlord is the “right” algorithm for web caching.
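The bit-model statement can be illustrated with a small simulation. The sketch below is not the paper's analysis: it assumes a standard size-aware LRU that evicts least recently used pages until the requested page fits, charges each miss the size of the fetched page, and compares the total with an unbounded cache that fetches each distinct page exactly once. The function names, page sizes, and request sequence are invented for illustration.

```python
from collections import OrderedDict


def lru_bit_model_cost(requests, size, capacity):
    """Total fetch cost of size-aware LRU when a miss costs the page's size."""
    cache = OrderedDict()  # page -> size, least recently used first
    used = 0
    cost = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)  # hit: refresh recency, no cost
            continue
        if size[page] > capacity:
            raise ValueError(f"page {page!r} does not fit in the cache")
        cost += size[page]           # miss: pay proportional to size
        while used + size[page] > capacity:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU page
            used -= evicted_size
        cache[page] = size[page]
        used += size[page]
    return cost


def unbounded_cache_cost(requests, size):
    """Cost with an unbounded cache: each distinct page is fetched once."""
    return sum(size[page] for page in set(requests))


# Hypothetical traversal sequence and page sizes.
requests = ["home", "docs", "api", "docs", "home", "blog", "home"]
size = {"home": 2, "docs": 3, "api": 4, "blog": 1}

eps = 0.5
capacity = (1 + eps) * max(size.values())  # (1 + eps) times the largest page
lru_cost = lru_bit_model_cost(requests, size, capacity)
opt_cost = unbounded_cache_cost(requests, size)
print(lru_cost, opt_cost, lru_cost <= (1 + 1 / eps) * opt_cost)
```

On this toy input the LRU cost stays within the (1 + 1/ε) factor of the unbounded-cache cost, which is consistent with the stated bound but of course does not prove it.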

Supported in part by NSF Grant CCR-9734927 and by AFOSR grant F49620010011.

Supported by the START program Y43-MAT of the Austrian Ministry of Science.

Author information

Authors and Affiliations

Dept. of Computer Science, Georgetown University, Washington D.C., 20057, USA
Bala Kalyanasundaram
Department of Mathematics, Technical University of Graz, Graz, Austria
John Noga & Gerhard Woeginger†
Dept. of Computer Science, University of Pittsburgh, Pittsburgh, PA., 15260, USA
Kirk Pruhs

About this paper

Cite this paper

Kalyanasundaram, B., Noga, J., Pruhs, K., Woeginger†, G. (2000). Caching for Web Searching. In: Algorithm Theory - SWAT 2000. SWAT 2000. Lecture Notes in Computer Science, vol 1851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44985-X_14

DOI: https://doi.org/10.1007/3-540-44985-X_14
Published: 15 March 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67690-4
Online ISBN: 978-3-540-44985-0
eBook Packages: Springer Book Archive
