A Learning-Based Approach for Web Cache Management

Mobile Networks and Applications

Abstract

Web caching is widely used to alleviate Internet traffic congestion in World Wide Web (WWW) services. To improve download throughput, an effective web cache management strategy must exploit web usage information when deciding which stored documents to evict once the cache is saturated. This paper presents the Learning-Based Replacement (LBR) algorithm, a hybrid approach to web cache replacement that incorporates a machine learning technique (naive Bayes) into the LRU replacement method. Using the access history in a web log, it improves the prediction of the probability that a cached page will be revisited by a succeeding request. The learned knowledge indicates which URL objects in the cache should be kept or evicted, and the learned model captures hidden aspects of user request patterns in order to predict the re-reference probability. In a number of experiments, LBR shows potential improvements in revisit-probability prediction, hit rate, and byte hit rate over the traditional methods LRU, LFU, and GDSF.
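
The abstract does not state which features LBR learns from, how its training labels are built, or exactly how the naive Bayes estimate is combined with LRU order. The following is therefore only a minimal sketch of the general idea, not the authors' implementation. It assumes two hypothetical categorical features (file extension and a size bucket), a hypothetical rule that examines a small window of the least recently used objects and evicts the one with the lowest predicted revisit probability, and a tiny invented training log and request trace. All class, function, and parameter names are illustrative.

```python
import math
from collections import OrderedDict, defaultdict


class NaiveBayesRevisit:
    """Categorical naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self):
        self.class_counts = {True: 0, False: 0}
        # counts[label][feature_index][value] -> number of training rows
        self.counts = {c: defaultdict(lambda: defaultdict(int)) for c in (True, False)}
        self.values = defaultdict(set)  # feature_index -> values seen in training

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for i, v in enumerate(row):
                self.counts[label][i][v] += 1
                self.values[i].add(v)

    def p_revisit(self, row):
        """Estimate P(revisited | features) from smoothed counts."""
        total = sum(self.class_counts.values())
        log_score = {}
        for label, n in self.class_counts.items():
            s = math.log((n + 1) / (total + 2))
            for i, v in enumerate(row):
                k = len(self.values[i] | {v})
                s += math.log((self.counts[label][i][v] + 1) / (n + k))
            log_score[label] = s
        top = max(log_score.values())
        weight = {c: math.exp(s - top) for c, s in log_score.items()}
        return weight[True] / (weight[True] + weight[False])


def featurize(url, size):
    # Assumed features: file extension and a coarse size bucket.
    ext = url.rsplit(".", 1)[-1] if "." in url else "none"
    return (ext, "small" if size < 10_000 else "large")


class LearningAssistedLRU:
    def __init__(self, capacity_bytes, model, window=4):
        self.capacity = capacity_bytes
        self.model = model
        self.window = window
        self.items = OrderedDict()  # url -> size, least recently used first
        self.used = 0

    def request(self, url, size):
        """Return True on a cache hit, False on a miss."""
        if url in self.items:
            self.items.move_to_end(url)  # refresh recency, as in plain LRU
            return True
        if size > self.capacity:
            return False  # object can never fit, so it is not cached
        while self.used + size > self.capacity:
            self._evict()
        self.items[url] = size
        self.used += size
        return False

    def _evict(self):
        # Among the `window` least recently used objects, evict the one with
        # the lowest predicted revisit probability (ties go to the oldest).
        candidates = list(self.items.items())[: self.window]
        victim, size = min(candidates, key=lambda kv: self.model.p_revisit(featurize(*kv)))
        del self.items[victim]
        self.used -= size


def replay(cache, trace):
    """Return (hit rate, byte hit rate) for a list of (url, size) requests."""
    hits = hit_bytes = total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if cache.request(url, size):
            hits += 1
            hit_bytes += size
    return hits / len(trace), hit_bytes / total_bytes


# Invented training log: small HTML pages were revisited, large archives were not.
model = NaiveBayesRevisit()
model.fit([("html", "small"), ("html", "small"), ("zip", "large"), ("zip", "large")],
          [True, True, False, False])

cache = LearningAssistedLRU(capacity_bytes=30_000, model=model)
trace = [("/a.html", 5_000), ("/b.zip", 20_000), ("/c.html", 5_000),
         ("/d.html", 5_000), ("/a.html", 5_000)]
hit_rate, byte_hit_rate = replay(cache, trace)
print(hit_rate, byte_hit_rate)  # 0.2 0.125
```

In this toy trace, plain LRU would evict `/a.html` when `/d.html` arrives and miss the final request. The sketch evicts `/b.zip` instead because its predicted revisit probability is lower, so the last request is a hit. Hit rate here is hits divided by requests, and byte hit rate is bytes served from cache divided by total bytes requested.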

Author information

Corresponding author

Correspondence to Phan Cong Vinh.

About this article

Cite this article

Songwattana, A., Theeramunkong, T. & Vinh, P.C. A Learning-Based Approach for Web Cache Management. Mobile Netw Appl 19, 258–271 (2014). https://doi.org/10.1007/s11036-014-0498-7

  • DOI: https://doi.org/10.1007/s11036-014-0498-7
