Predicting Page Occurrence in a Click-Stream Data: Statistical and Rule-Based Approach

Berka, Petr; Labský, Martin

doi:10.1007/978-3-540-73435-2_11

Petr Berka¹ &
Martin Labský¹

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4597)

Included in the following conference series:

Industrial Conference on Data Mining

Abstract

We present an analysis of click-stream data with the aim of predicting the next page that a user will visit, based on the history of pages visited so far. We present one statistical method (based on Markov models) and two rule induction methods (the first based on the well-known set covering approach, the other based on our compositional algorithm KEX). We compare the achieved results and discuss interesting patterns that appear in the data.
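
To make the statistical approach concrete, the following is a minimal sketch of next-page prediction with a first-order Markov model. It is not the authors' implementation: the abstract does not state the model order, smoothing, or data format, so the function names (train_markov, predict_next), the toy sessions, and the page names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_markov(sessions):
    """Count first-order transitions (page -> next page) over all sessions."""
    transitions = defaultdict(Counter)
    for session in sessions:
        # Pair each page with the page that immediately follows it.
        for current, following in zip(session, session[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions, history):
    """Return the most frequent successor of the last visited page, or None."""
    if not history or history[-1] not in transitions:
        return None  # empty history, or a page never seen as a predecessor
    return transitions[history[-1]].most_common(1)[0][0]


# Hypothetical toy sessions; the page names are made up for illustration.
sessions = [
    ["home", "catalog", "detail", "cart"],
    ["home", "catalog", "detail", "catalog"],
    ["home", "search", "detail", "cart"],
]
model = train_markov(sessions)
print(predict_next(model, ["home", "catalog"]))
```

A first-order model looks only at the last page of the history. Using longer histories as the conditioning context would be a natural extension, but whether and how the paper does this cannot be determined from the abstract alone.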

Author information

Authors and Affiliations

Department of Information and Knowledge Engineering, University of Economics, W. Churchill Sq. 4, 130 67 Prague, Czech Republic
Petr Berka & Martin Labský

Editor information

Petra Perner

About this paper

Cite this paper

Berka, P., Labský, M. (2007). Predicting Page Occurrence in a Click-Stream Data: Statistical and Rule-Based Approach. In: Perner, P. (ed.) Advances in Data Mining. Theoretical Aspects and Applications. ICDM 2007. Lecture Notes in Computer Science, vol 4597. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73435-2_11

DOI: https://doi.org/10.1007/978-3-540-73435-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73434-5
Online ISBN: 978-3-540-73435-2
eBook Packages: Computer Science, Computer Science (R0)
