Abstract
We analyze click-stream data with the aim of predicting the next page a user will visit, based on the history of pages already visited. We present one statistical method (based on Markov models) and two rule induction methods (the first based on the well-known set covering approach, the second based on our compositional algorithm KEX). We compare the results achieved and discuss interesting patterns that appear in the data.
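To make the prediction task concrete, the following is a minimal sketch of next-page prediction with a first-order Markov model. It is an illustration only, not the authors' implementation: the abstract does not state the model order or any smoothing used, and the function names and page labels here are hypothetical.

```python
from collections import Counter, defaultdict


def train_first_order(sessions):
    """Count page-to-page transitions over a list of sessions.

    Each session is a list of page identifiers in visit order.
    Assumption: a first-order model; the paper's model order is not
    given in the abstract.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair every page with the page visited immediately after it.
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, history):
    """Return the most frequent successor of the last visited page.

    Returns None when the history is empty or the last page was never
    seen as a predecessor during training.
    """
    if not history or history[-1] not in counts:
        return None
    return counts[history[-1]].most_common(1)[0][0]


# Hypothetical toy sessions, invented for illustration.
sessions = [
    ["home", "catalog", "product", "cart"],
    ["home", "catalog", "product", "catalog"],
    ["home", "search", "product", "cart"],
]
model = train_first_order(sessions)
prediction = predict_next(model, ["home", "catalog", "product"])
print(prediction)
```

In this toy data, "product" is followed by "cart" twice and by "catalog" once, so the sketch predicts "cart". A higher-order variant would key the counts on the last k pages instead of only the last one.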
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Berka, P., Labský, M. (2007). Predicting Page Occurrence in a Click-Stream Data: Statistical and Rule-Based Approach. In: Perner, P. (ed.) Advances in Data Mining. Theoretical Aspects and Applications. ICDM 2007. Lecture Notes in Computer Science, vol 4597. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73435-2_11
DOI: https://doi.org/10.1007/978-3-540-73435-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73434-5
Online ISBN: 978-3-540-73435-2
eBook Packages: Computer Science, Computer Science (R0)