Prediction of Web Page Accesses by Proxy Server Log

Wu, Yi-Hung; Chen, Arbee L. P.

doi:10.1023/A:1015750423727

Published: March 2002

World Wide Web, Volume 5, pages 67–88 (2002)

Abstract

As the population of web users grows, so does the variety of their information-access behaviors, which strongly affects network utilization. Many recent efforts have analyzed user behaviors on the WWW. In this paper, we represent user behaviors as sequences of consecutive web page accesses, derived from the access log of a proxy server. We discover the frequent sequences and organize them as an index. Based on the index, we propose a scheme for predicting user requests and a proxy-based framework for prefetching web pages. Experiments on real data show that our approach makes predictions with high accuracy and little overhead: the best prediction hit ratio reaches 75.69%, while the longest time to make a prediction is only 2.3 ms.
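
The general idea of indexing frequent access sequences and using them to predict the next request can be illustrated with a minimal sketch. This is not the authors' algorithm or index structure, which the abstract does not describe in detail. It assumes that sessions are already extracted from the proxy log as lists of page URLs, and it uses a plain dictionary keyed by the preceding pages. The names `build_index`, `predict`, `max_len`, and `min_support` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_index(sessions, max_len=3, min_support=2):
    """Index frequent consecutive page sequences (illustrative only).

    Counts every run of 2..max_len consecutive pages across all sessions,
    keeps runs seen at least min_support times, and maps each run's
    leading pages (the context) to a Counter of the pages that followed.
    """
    counts = Counter()
    for session in sessions:
        for n in range(2, max_len + 1):
            for i in range(len(session) - n + 1):
                counts[tuple(session[i:i + n])] += 1

    index = defaultdict(Counter)
    for seq, count in counts.items():
        if count >= min_support:
            index[seq[:-1]][seq[-1]] = count
    return index


def predict(index, recent, max_len=3):
    """Return the most frequent follower of the longest indexed suffix.

    Tries the last max_len - 1 pages first, then shorter suffixes.
    Returns None when no suffix of the recent accesses is indexed.
    """
    for k in range(min(max_len - 1, len(recent)), 0, -1):
        context = tuple(recent[-k:])
        if context in index:
            return index[context].most_common(1)[0][0]
    return None


# Hypothetical sessions; real input would come from a proxy access log.
sessions = [
    ["/a", "/b", "/c"],
    ["/a", "/b", "/c"],
    ["/a", "/b", "/d"],
    ["/x", "/b", "/d"],
    ["/x", "/b", "/d"],
]
index = build_index(sessions)
print(predict(index, ["/a", "/b"]))  # uses the two-page context
print(predict(index, ["/z", "/b"]))  # falls back to the one-page context
```

In a prefetching setting, the predicted page would be a candidate for the proxy to fetch ahead of the user's request. Matching the longest suffix first is a design choice of this sketch: it prefers more specific contexts and falls back to shorter ones when no longer match is frequent.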

Author information

Authors and Affiliations

Department of Computer Science, National Tsing Hua University, Hsinchu, 300, Taiwan R.O.C
Yi-Hung Wu & Arbee L. P. Chen

About this article

Cite this article

Wu, YH., Chen, A.L.P. Prediction of Web Page Accesses by Proxy Server Log. World Wide Web 5, 67–88 (2002). https://doi.org/10.1023/A:1015750423727

Issue Date: March 2002
DOI: https://doi.org/10.1023/A:1015750423727
