Mining user access patterns with traversal constraint for predicting web page requests

  • Short Paper
Knowledge and Information Systems

Abstract

The recent increase in HyperText Transfer Protocol (HTTP) traffic on the World Wide Web (WWW) has generated an enormous number of log records in Web server databases. Applying Web mining techniques to these server log records can uncover potentially useful patterns and reveal how users access a Web site. In this paper, we propose a new two-step approach to mining user access patterns for predicting Web page requests. First, the Minimum Reaching Distance (MRD) algorithm is applied to find the distances between Web pages. Second, association rule mining is applied to form a set of predictive rules, and the MRD information is used to prune the resulting rules. Experimental results on a real Web data set show that our approach outperforms the existing Markov-model approach in precision, recall, and the reduction of user browsing time.
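
The two steps can be illustrated with a minimal sketch. The abstract does not define MRD or the pruning criterion, so the code below rests on labeled assumptions: MRD is taken to be the fewest hyperlink hops between two pages (computed by breadth-first search), rules are restricted to single-page pairs a -> b, and a rule is kept only if b is reachable from a within a hypothetical `max_hops` threshold. The function names, the toy link graph, and the toy sessions are all illustrative and are not taken from the paper.

```python
from collections import defaultdict, deque
from itertools import permutations


def minimum_reaching_distance(links, source):
    """Hop counts from `source` to every reachable page (assumed MRD definition)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in dist:  # first visit in BFS gives the fewest hops
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def mine_pair_rules(sessions, min_support, min_confidence):
    """Return {(a, b): confidence} for pairwise rules a -> b mined from sessions."""
    page_count = defaultdict(int)
    pair_count = defaultdict(int)
    for session in sessions:
        pages = set(session)  # count each page once per session
        for page in pages:
            page_count[page] += 1
        for a, b in permutations(pages, 2):
            pair_count[(a, b)] += 1
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / len(sessions)
        confidence = count / page_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def prune_by_mrd(rules, links, max_hops):
    """Keep a -> b only if b is reachable from a within max_hops (assumed criterion)."""
    kept = {}
    for (a, b), confidence in rules.items():
        hops = minimum_reaching_distance(links, a).get(b)
        if hops is not None and hops <= max_hops:
            kept[(a, b)] = confidence
    return kept


# Toy site structure and sessions (hypothetical data).
links = {"home": ["news", "docs"], "news": ["story"], "docs": ["api"]}
sessions = [
    ["home", "news", "story"],
    ["home", "docs", "api"],
    ["home", "news", "story"],
    ["docs", "api"],
]
rules = mine_pair_rules(sessions, min_support=0.5, min_confidence=0.6)
pruned = prune_by_mrd(rules, links, max_hops=2)
```

In this toy run, rules such as story -> home satisfy the support and confidence thresholds but are discarded because the link structure offers no path from the antecedent to the consequent. Whether the paper prunes on reachability, on a distance bound, or on some other use of MRD is not stated in the abstract.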

Author information

Corresponding author

Correspondence to Mei-Ling Shyu.

Additional information

Mei-Ling Shyu received her Ph.D. degree from the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN in 1999, and three Master's degrees, in Computer Science, Electrical Engineering, and Restaurant, Hotel, Institutional, and Tourism Management, from Purdue University. She has been an Associate Professor in the Department of Electrical and Computer Engineering (ECE) at the University of Miami (UM), Coral Gables, FL, since June 2005. Prior to that, she was an Assistant Professor in ECE at UM from January 2000. Her research interests include data mining, multimedia database systems, multimedia networking, database systems, and security. She has authored and co-authored more than 120 technical papers published in various prestigious journals, refereed conference/symposium/workshop proceedings, and book chapters. She is/was the guest editor of several journal special issues.

Choochart Haruechaiyasak received his Ph.D. degree from the Department of Electrical and Computer Engineering, University of Miami, in 2003 with the Outstanding Departmental Graduating Student award from the College of Engineering. After receiving his degree, he joined the National Electronics and Computer Technology Center (NECTEC), located in Thailand Science Park, as a researcher in the Information Research and Development Division (RDI). His current research interests include data/text/Web mining, Natural Language Processing, Information Retrieval, Search Engines, and Recommender Systems. He is currently leading a small group of researchers and programmers to develop an open-source search engine for the Thai language. One of his objectives is to promote the use of data mining technology and other advanced applications in Information Technology in Thailand. He is also a visiting lecturer for Data Mining, Artificial Intelligence, and Decision Support Systems courses at many universities in Thailand.

Shu-Ching Chen received his Ph.D. from the School of Electrical and Computer Engineering at Purdue University, West Lafayette, IN, USA in December 1998. He also received Master's degrees in Computer Science, Electrical Engineering, and Civil Engineering from Purdue University. He has been an Associate Professor in the School of Computing and Information Sciences (SCIS), Florida International University (FIU) since August 2004. Prior to that, he was an Assistant Professor in SCIS at FIU from August 1999. His main research interests include distributed multimedia database systems and multimedia data mining. Dr. Chen has authored and co-authored more than 140 research papers in journals, refereed conference/symposium/workshop proceedings, and book chapters. In 2005, he was awarded the IEEE Systems, Man, and Cybernetics Society's Outstanding Contribution Award. He was also awarded a University Outstanding Faculty Research Award from FIU in 2004, an Outstanding Faculty Service Award from SCIS in 2004, and an Outstanding Faculty Research Award from SCIS in 2002.

About this article

Cite this article

Shyu, ML., Haruechaiyasak, C. & Chen, SC. Mining user access patterns with traversal constraint for predicting web page requests. Knowl Inf Syst 10, 515–528 (2006). https://doi.org/10.1007/s10115-006-0004-z

  • DOI: https://doi.org/10.1007/s10115-006-0004-z
